(54) Title: TAG REAGENT AND ASSAY METHOD
(57) Abstract 

A reagent comprises: a) an analyte moiety comprising at least two
analyte residues, and linked to b) a tag moiety comprising one or more
reporter groups adapted for detection by mass spectrometry, wherein
a reporter group designates an analyte residue, and the reporter group
at each position of the tag moiety is chosen to designate an analyte
residue at a defined position of the analyte moiety. A plurality of such
reagents, each comprising a different analyte moiety, provides a library
of reagents which may be used in assay methods involving a target
substance. Analysis of the tag moieties indicates the nature of the analyte
moieties bound to the target substance. A method of sequencing nucleic
acid employs a library of the reagents to determine the sequence of a
target nucleic acid.
TAG REAGENT AND ASSAY METHOD

In biological and chemical analyses, the use
of analyte molecules labelled with reporter groups is
routine. This invention addresses the idea of
providing reagents having at least two analyte groups
linked to one or more reporter groups. Such reagents
can be used, in ways described below, to generate much
more analytical information than can simple labelled
analytes. It is possible to code reporter groups so
that reagents carrying multiple analyte groups and
multiple reporter groups can be synthesised
combinatorially and used simultaneously and the
reporter groups resolved in the analytical stage.

WO 93/06121 (Affymax) describes a synthetic
oligomer library comprising a plurality of different
members, each member comprising an oligomer composed of
a sequence of monomers linked to one or more identifier
tags identifying the sequence of monomers in the
oligomer. The linkage between the oligomer and the
identifier tag preferably comprises a solid particle.
The identifier tag is preferably an oligonucleotide.

In one aspect the present invention provides
a reagent comprising

a) an analyte moiety comprising at least two
analyte residues, and linked to

b) a tag moiety comprising one or more reporter
groups adapted for detection by mass spectrometry,

wherein a reporter group designates an
analyte residue, and the reporter group at each
position of the tag moiety is chosen to designate an
analyte residue at a defined position of the analyte
moiety.

Preferably the analyte moiety is linked to
the tag moiety by a link which is cleavable, e.g.
photocleavable. There may be provided a linker group



PCT/GB94/0167S 

to which the analyte moiety and the tag moiety are both
attached. Preferably the analyte moiety is a chain of
n analyte residues, and the tag moiety is a chain of up
to n reporter groups, the reporter group at each
position of the tag chain being chosen to designate the
analyte residue at a corresponding position of the
analyte chain; n is an integer of at least 2,
preferably 3 to 20.

The invention may be used for the detection
of all analytes of interest. These include, but are
not limited to, a protein/peptide chain so that the
analyte residues are amino acid residues; a nucleic
acid/oligonucleotide chain so that the analyte residues
are nucleotide residues; a carbohydrate chain so that
the analyte residues are sugar residues. Additionally
the analyte may be a class of small molecules with
biological, pharmacological or therapeutic activity.
For example it could be a core molecule on which
various substituent groups, e.g. alkyl, ester, amine
and ether groups, can be varied in a combinatorial
manner with mass spectrometry tags.

The tag moiety and/or the or each reporter 
group in it is capable of being observed/detected/ 
analysed so as to provide information about the nature 
of the analyte moiety, and/or the analyte residues in 
it. 

In one embodiment, the reagent has the 
formula A - L - R where A is a chain of n analyte 
residues constituting the analyte moiety, L is the 
linker, R is a chain of up to n reporter groups 
constituting the tag moiety, and n is 2 - 20, wherein 
the tag moiety contains information defining the 
location of analyte residues in the analyte moiety. 

The tag moiety consists of one or more
reporter groups distinguishable by mass and thus
capable of being analysed by mass spectrometry. The
reporter groups may be chemically different and thus
distinguished from one another by molecular weight. Or
the reporter groups may be chemically identical, but
distinguished from one another by containing different
isotopes (e.g. ¹²C/¹³C and ¹H/²H as discussed below).
The tag moiety is, and/or the reporter groups are,
suitable or adapted for analysis by mass spectrometry
e.g. after cleavage by photochemical or other means
from the reagent.

The advantages of mass spectrometry as a 
detection system are: its great sensitivity - only a 
few hundred molecules are needed to give a good signal; 
its wide dynamic range and high resolving power - 
molecules in the mass range 100 to 200,000 Daltons can 
be resolved with a resolution better than 0.01; its 
versatility - molecules of many different chemical 
structures are readily analysed; the potential to image 
analytes by combining mass spectrometry with, for 
example, scanning laser desorption; and the ability to
make quantitative as opposed to merely qualitative 
measurements. 

Thus mass-labelling combines advantages of 
radioactivity and fluorescence and has additional 
attributes which suggest novel applications. 

In another aspect, the invention provides a
library of the above reagents, wherein the library
consists of a plurality of reagents each comprising a
different analyte moiety of n analyte residues. For
example, the library may consist of 4^n reagents each
comprising a different oligonucleotide chain of n
nucleotide residues. The reagents of the library may
be present mixed together in solution.
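
As an illustration of the library size stated above, the following minimal sketch counts and enumerates every oligonucleotide chain of n residues. The function names are hypothetical and chosen for this illustration only; the four-letter alphabet A, C, G, T is assumed.

```python
from itertools import product

BASES = "ACGT"  # assumed alphabet of nucleotide residues


def library_size(n: int) -> int:
    """Number of distinct oligonucleotide chains of n residues (4^n)."""
    return len(BASES) ** n


def enumerate_library(n: int) -> list[str]:
    """List every chain of n residues, one per reagent in the library."""
    return ["".join(chain) for chain in product(BASES, repeat=n)]


if __name__ == "__main__":
    print(library_size(6))            # number of hexanucleotides
    print(enumerate_library(2)[:4])   # first few dinucleotides
```

For n = 6 this gives the 4096 hexanucleotides used as the running example later in the specification.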

In another aspect, the invention provides an
assay method which comprises the steps of: providing a
target substance; incubating the target substance with
the said library of reagents under conditions to cause
at least one reagent to bind to the target substance;
removing non-bound reagents; recovering the tag
moieties of the or each bound reagent; and analysing
the recovered tag moieties as an indication of the
nature of the analyte moieties bound to the target
substance.

The target substance may be immobilised, as 
this provides a convenient means for separating bound 
from non-bound reagent. In one aspect, the target 
substance may be an organism or tissue or group of 
cells, and the assay may be performed to screen a 
family of candidate drugs. In another aspect, the 
target substance may be a nucleic acid, and this aspect 
is discussed in greater detail below. 
Reference is directed to the accompanying
drawings in which:

- Figure 1 is a general scheme for synthesis
of reagents according to the invention.

- Figure 2 shows reagents with three
different systems of tag chains containing reporter
groups.

- Figure 3a is a diagram showing synthesis 
of coded oligonucleotides, and 

- Figure 3b is a diagram showing reading the 
code of a tag chain. 

- Figure 4 is a diagram showing sequence 
analysis by progressive ligation. 

- Figure 5 is a diagram on extending the 
sequence read by hybridisation to an oligonucleotide 
assay. 

Legends to Figures 1 and 2 are included at 
the end of this specification. 

Reference is directed to the example 
applications below, describing how the method may be 
applied to the analysis of nucleic acid sequences, and 
to screening candidate drugs. 
Synthesis of coded tags

The principle of the method used for tagging
multiple analytes simultaneously is similar to that
proposed by Brenner and Lerner (1992) for coding
peptides with attached nucleic acid sequences. The
intention of their idea is to add a tag which can be
amplified by the polymerase chain reaction and read by
sequencing the DNA molecule produced.

The structure of reagents is best illustrated
by considering how they could be made. Synthesis
starts with a bivalent or multivalent linker which can
be extended stepwise in one direction to add a residue
to the analyte and in another to add residue-specific
reporter groups (Fig. 1). Suppose we wish to make a
mixture of organic compounds, introducing different
residues at each stage in the synthesis. For example,
the mixture could comprise a set of peptides with
different amino acid sequences or of oligonucleotides
with different base sequences, or a set of variants
with potential pharmacological activity with different
groups attached to a core structure; in each case we
wish to label each structural variant with a unique
tag. This is done by dividing the synthesis at each
step where different residues are added to the compound
of interest, and adding corresponding residues to the
tag.

As an example, suppose we wish to make a 
mixture of 4096 hexanucleotides, each with a unique 
tag. Four samples of a bivalent linker would be 
coupled with each of the bases and with the unique 
reporter for the base (Fig. 3a). The four samples are 
then mixed, divided in four and the process repeated. 
The result is a set of dinucleotides each with a unique 
tag. The process is repeated until six coupling steps 
have been completed. 
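
The split-and-mix procedure just described can be sketched as a small simulation. This is an illustrative model only: each molecule is represented as a pair (analyte sequence, tuple of reporter weights), the reporter weights are taken from Table 1 of this specification, and the function name `split_and_mix` is hypothetical.

```python
# Reporter formula weights per base, as listed in Table 1 (assumed mapping).
REPORTER_WEIGHT = {"A": 30, "C": 31, "G": 32, "T": 33}
BASES = "ACGT"


def split_and_mix(cycles: int) -> list[tuple[str, tuple[int, ...]]]:
    """Simulate divided synthesis: in each cycle the pooled material is
    split into four samples, and each sample receives one base on the
    analyte chain plus the matching reporter on the tag chain."""
    pool = [("", ())]  # bare linker: empty analyte, empty tag
    for _ in range(cycles):
        next_pool = []
        for analyte, tag in pool:          # material is mixed, then divided
            for base in BASES:             # one sample per base
                next_pool.append((analyte + base,
                                  tag + (REPORTER_WEIGHT[base],)))
        pool = next_pool                   # samples are recombined
    return pool


if __name__ == "__main__":
    library = split_and_mix(6)
    print(len(library))   # count of tagged hexanucleotides
    print(library[0])     # first member and its tag
```

In this model every analyte chain ends up paired with a distinct tag, because the tag records the reporter added at each coupling step.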
The linker and reporter groups

The linker should have one group that is
compatible with analyte synthesis - hydroxyl, amino or
sulphydryl groups are all suitable for initiating
oligonucleotide synthesis, and similar groups can be
found to initiate other pathways, for example,
synthesis of polypeptides. For some classes of
compounds it may be desirable to start with a "core"
compound which forms part of the analyte. The choice
of the group(s) for starting addition of reporters
depends on the nature of the reporter groups and the
chemistry used to couple them. This chemistry has to
be compatible with that used for synthesising the
analyte. For the example of oligonucleotide synthesis,
there are a number of alternatives. The established
method uses benzoyl and isopropyl groups to protect the
bases, acid-labile trityl groups for temporary
protection of the 5'-OH groups during coupling, and
β-cyanoethyl groups to protect the phosphates. The
method used for coupling the reporters should not
attack these protecting groups or other bonds in the
oligonucleotide, and the synthesis of the tags should
not be affected by the coupling, oxidation, and
deprotection used in the extension of the
oligonucleotide.

The coupling of the reporter monomers, or the
capping of the chain, may be incomplete at each step
(Fig. 2, B and C), so that the analyte is coupled to a
nested set of reporter structures. This will make it
easier to deduce the structure of the analyte from the
composition of the tag (Fig. 1; Fig. 3). To make the
synthesis easier it is desirable for the linker to be
attached to a solid support by a linkage which can be
cleaved without degrading the analyte or the reporter
groups. Alternatively, the linker may carry a group
such as a charged group or a lipophilic group which
enables separation of intermediates and the final
product from reagents.

The reporter groups could take many forms;
the main consideration is the need to read the
composition or sequence of the tag by mass
spectrometry. Possibilities include groups with
different atomic or formula weights, such as aliphatic
chains of different lengths or different isotopic
composition. Using isotopically labelled methylene
groups, it is possible to assign a group of unique
formula weight to each of four different reporters
(Table 1).



Table 1

Example reporters based on isotopes of hydrogen and carbon:

Symbol   Base   Isotopic Composition   Formula Weight (of -OCH2)

r30      A      ¹²CH₂                  30
r31      C      ¹²CHD, ¹³CH₂           31
r32      G      ¹²CD₂, ¹³CHD           32
r33      T      ¹³CD₂                  33


Taking the example of oligonucleotides these 
tags can make a set which allows the base at each 
position in the oligonucleotide to be read from the 
incremental masses of the partial products in the 
series (Table 2). All oligonucleotide sequences will 
give a unique series of tag fragment weights provided 
the smallest increment in adding a reporter is larger 
than the mass difference between the smallest and the 
largest reporter. 
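
Reading a sequence from the incremental masses of a nested set of partial products can be sketched as follows. This is a minimal illustration, assuming the reporter weights of Table 1, assuming that the linker weight has already been subtracted from each measured mass, and using hypothetical function names; the example sequence ACGT is chosen for the illustration and is not taken from Table 2.

```python
# Reporter formula weights per base, as listed in Table 1 (assumed mapping).
REPORTER_WEIGHT = {"A": 30, "C": 31, "G": 32, "T": 33}
BASE_FOR_WEIGHT = {weight: base for base, weight in REPORTER_WEIGHT.items()}


def partial_product_weights(sequence: str) -> list[int]:
    """Cumulative tag weights of the nested partial products,
    excluding the weight of the linker itself."""
    total, weights = 0, []
    for base in sequence:
        total += REPORTER_WEIGHT[base]
        weights.append(total)
    return weights


def read_sequence(weights: list[int]) -> str:
    """Recover the base at each position from the mass increments
    between successive partial products (peaks may arrive unsorted)."""
    previous, bases = 0, []
    for weight in sorted(weights):
        bases.append(BASE_FOR_WEIGHT[weight - previous])
        previous = weight
    return "".join(bases)


def series_is_unambiguous() -> bool:
    """Condition from the text: the smallest increment must exceed the
    spread between the smallest and largest reporter."""
    values = REPORTER_WEIGHT.values()
    return min(values) > max(values) - min(values)


if __name__ == "__main__":
    peaks = partial_product_weights("ACGT")
    print(peaks)
    print(read_sequence(peaks))
```

With the Table 1 reporters the smallest increment is 30 and the spread is 3, so the condition for a unique series holds.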
Table 2

Example oligonucleotide with isotopic reporters:

G.A.T.C.T.A - P - r - r - r - r - r - r

Formula weights of partial products:
P + 30, + 63, + 94, + 127, + 157, + 190

P = photolabile linker - formula weight of P


For mass spectrometry, it will be desirable 
to have a simple way of cleaving the tag chain from the 
analyte. There are several possibilities. Among 
methods compatible with oligonucleotide and peptide 
analytes are: light induced cleavage of a photolabile 
link; enzymatic cleavage, for example of an ester 
link; free-radical induced cleavage.

A further requirement is that the tags should 
be compatible with the chemical and biochemical 
processes used in the analysis: for the example of 
oligonucleotides used in molecular hybridisation or for 
one of the proposed sequencing methods, they must be 
soluble and they must not inhibit certain enzymatic 
reactions which may be used in the analysis. 
Experience has shown that oligoethylene glycol
linkages, similar to the methylene analogues shown in
Table 1, are compatible with molecular reassociation of
oligonucleotides. Furthermore, such linkages are
compatible with at least some enzymatic reactions as we
have shown that oligonucleotides tethered to glass
through a hexaethylene glycol linker can be converted
to a 5'-phosphomonoester by treatment with
polynucleotide kinase and ATP.
Desirable properties of the linker

For the applications envisaged, it is
desirable that the linker molecule has the following
properties:

It should be possible to link it to a solid 
support to allow for synthetic cycles to produce the 
analyte and corresponding tags to proceed without the 
need for cumbersome purification of intermediates. 
Following synthesis cycles, the linker should be 
removable from the solid support under conditions which 
leave the analyte and tags intact. The functional 
group for tag synthesis should be such that it allows 
for the ready synthesis of tags which are 
distinguishable from each other by mass spectrometry. 

The linker should have protected functional 
groups that allow for the extension of the analyte and 
the tags separately, under conditions in which the 
chemistry for one does not interfere with that of the 
other. 

The linker should preferably carry a charged 
group so that mass spectrometry can be carried out in 
the absence of a matrix. Further to this aim, it is 
desirable that the tags should comprise compounds which 
are volatile enough to evaporate in the mass 
spectrometer, without recourse to complex techniques 
such as electrospray. The tags should either
produce stable ions or ions which fragment to 
characteristic patterns that can be used to identify 
the corresponding analyte. 

The link between the tag and analyte should 
preferably be photocleavable, so that tags can be 
directly cleaved in the mass spectrometer by laser 
irradiation, and further cleavage to remove them 
completely to allow biochemical steps such as ligation, 
can be carried out conveniently by exposure to a lamp. 

The linked products should preferably be
soluble in aqueous solvents, so that they can be used
in biochemical reactions.

The examples described herein show linkers 
with these desired properties. 

Photocleavable group

The photocleavable group has been based on
the known photolabile o-nitrobenzyl group. This group
has been used as a protecting group for both the
phosphate group and 2'-hydroxy group in oligonucleotide
synthesis [see the review by Pillai, Synthesis 1
(1980)]. In itself the o-nitrobenzyl group lacks
further functionalisation for subsequent attachment of
a linker between tags and analyte. Available from
commercial sources is the compound 5-hydroxy-2-nitrobenzyl
alcohol. It is known that OMe groups can
be added in the 5,4 position without significant
reduction in photolabile properties (see Pillai
review). Thus, the 5-hydroxy-2-nitrobenzyl alcohol was
used as a starting point with the aim of extending DNA
synthesis from the benzyl alcohol and the linker chain
to the tags from an ether coupling at the 5-hydroxy
group.

The requirement is for a functional group
to be present to permit the combinatorial synthesis of
analytes and tags. A linker arm is therefore required
from the photocleavable group to the required
functional group for tag synthesis. It is also
preferred that the combinatorial synthesis be carried
out on a solid support. Thus, the linker arm must be
bivalent in functional groups and have orthogonal
protecting groups to permit selective synthetic
transformation. Preferred tag reagents contain glycol
linkages/ether linkages. For synthesis,
oligonucleotides are normally linked to a long chain
amino CPG support via the 3'-hydroxy and a succinic
ester link. Thus the functional groups required were
deemed to be alcohols.

The following intermediate compound has been 
synthesised. 



This comprises an aromatic linker carrying:

a methoxytrityl group (-CH2ODMT) for analyte
synthesis;

an o-nitro group for photocleavage;

an O-t-butyl diphenyl silyl group (OTBDPS)
for tag synthesis;

a tertiary amine group for conversion to a
positively charged group for analysis by mass
spectrometry;

and an N-hydroxysuccinimidyl group for
attachment to a support.

When the analyte is a peptide only minor
modifications to conditions need be considered. The
2-nitrobenzyl group is stable under most of the
conditions of peptide synthesis and it and related
analogues have already been used as photolabile groups
in peptide synthesis (see Pillai review and the
references contained therein). There are already
several resins suited to peptide synthesis with
different modes of cleavage. The orthogonal protecting
groups for analyte and tag synthesis would be based
on t-butoxycarbonyl and 2-methoxyethoxymethyl. The
t-butoxycarbonyl group would be used to protect the
amino group in the amino acids with cleavage being
effected by a trifluoroacetic acid treatment. The
2-methoxyethoxymethyl group would be used to protect the
tagging groups, and the tags would be based on
mass-differentiated 1,n-alkyldiol derivatives as before.
The cleavage
of t-butoxycarbonyl groups has been shown to be
compatible with the 2-methoxyethoxymethyl protecting
groups. The 2-methoxyethoxymethyl protecting groups
can be selectively cleaved with zinc bromide in
dichloromethane. While the above illustrates the
procedure, those skilled in the art will recognise
that this set of orthogonal protecting groups is by no
means limiting but serves as a representative example.

Detection and analysis of reporters

Photocleavage is the favoured method of
releasing tags from analytes; it is fast, can be
carried out in the dry state, and scanning lasers can
be used to image at a very small scale, small enough to
image features within cells (de Vries et al., 1992), so
that the proposed method could be used to detect the
positions of specific analytes that had been used to
"stain" the surface or the insides of cells, or
different cells in a tissue slice, such as may be
required to image interactions between ligands, e.g.
candidate drugs, and their receptors.

Photosensitive protecting groups are
available for a very wide range of chemical residues
[reviewed in Pillai, 1980]. The photolabile o-nitrobenzyl
group, which can be used as a protecting group
for a wide range of compounds, forms an ideal starting
point for a linker for many analytes that could be
envisaged, peptides and oligonucleotides among them.
Taking the example of oligonucleotides, it provides a
photosensitive link that can be broken quantitatively
to give a hydroxyl group. This will permit the
deprotected oligonucleotide to take part in the
ligation extension as described in the sequencing
method below. Furthermore, the group is known to be
stable during oligonucleotide synthesis. It would be
necessary to modify the benzyl ring to provide a group
that can be used to initiate the synthesis of the tags;
reporters such as the oligoethyleneglycol series
described above do not interfere with the photochemical
cleavage reaction of the o-nitrobenzyl group (Pillai,
op. cit.). Other groups can be added to the aromatic
ring which enhance the cleavage; such groups could be
exploited to add a charged group(s) to simplify
analysis in the mass spectrometer. Modern mass
spectrometers are capable of measuring a few hundred
molecules with a resolution better than one Dalton in a
hundred, up to a total mass of 200 kD. A preferred
photolabile linker may be represented thus, in which
the positively charged group R may be directly attached
to the aromatic ring or may be present in one of the
linker arms:



[Structural formula not reproduced: nitro-substituted aromatic
linker carrying the nucleotide chain and the group R]

R = a positively charged group

Instrumentation

The proposed molecular tags would be analysed 
by one of several forms of mass spectrometry. For many 
purposes, although it will be desirable to cleave the
tags from the analytes, it will not be necessary to 
fragment the tags, and indeed it may be undesirable as 
it could lead to ambiguities. Recent developments in 
mass spectrometry allow the measurement of very large 
molecules without too much fragmentation; and as it is 
possible to design the linker so that it is readily 
cleaved, under conditions where the rest of the tag is 
stable, fragmentation of the tag during measurement 
should be avoidable. The analyte group will, in most 
cases, be less volatile than the tag, and in many 
applications will be bound to a solid substrate, and 
thus prevented from interfering with mass spectrometry. 

The linker illustrated above is very labile
to photon irradiation under conditions which will cause
no cleavage of the great majority of covalent chemical
bonds. A suitable instrument has been described [de
Vries et al., 1992]. This uses a laser that can be
focussed down to a spot smaller than 1 µm. Images of up
to 250 mm are scanned by moving a stage that can be
positioned to 0.1 µm.

This instrument also allows for ionisation of 
the species to be measured by shining an ionising laser 
across the surface of the stage so that it interacts 
with the species lifted by the desorption laser. This 
could be useful for the present method if it were not 
possible to include a charged residue in the tags, or 
if fragmentation is desirable for reading the tags. 

In another aspect the invention provides a
method of sequencing a target nucleic acid, which
method comprises the steps of:

a) providing an oligonucleotide immobilised on a
support,

b) hybridising the target nucleic acid with the
immobilised oligonucleotide,

c) incubating the hybrid from b) with the
library as defined above, in which the reagents are mixed
together in solution, so that an oligonucleotide chain
of a first reagent of the library becomes hybridised to
the target nucleic acid adjacent the immobilised
oligonucleotide,

d) ligating the adjacent oligonucleotides, thus
forming a ligated first reagent,

e) removing other non-ligated reagents, and

f) recovering and analysing the tag moiety of
the ligated first reagent as an indication of the
sequence of a first part of the target nucleic acid.

Example applications 

We illustrate potential applications by 
referring to ways in which coded oligonucleotides could 
be used in nucleic acid analysis. 



1. Nucleic acid sequence determination by progressive
ligation (Fig. 4)

The sequence to be determined is first
hybridised in step b) to an oligonucleotide attached to
a solid support. If the DNA to be sequenced has been
cloned in a single strand vector such as bacteriophage
M13, the "primer" oligonucleotide on the solid
support can be chosen to be part of the vector
sequence. In step c), the solid support carrying the
hybrids from step b) is incubated with a solution of
the coded oligonucleotide reagents, e.g. with the
aforesaid library, comprising all sequences of a given
length, say 4096 hexanucleotides (4^n n-mers, in
general). In step d), ligase is introduced so that
the hexanucleotide complementary to the first six bases
in the target DNA is joined to the immobilised primer
oligonucleotide. By this step a first coded
oligonucleotide reagent from the library is joined, by
ligation of its oligonucleotide chain to the
immobilised primer oligonucleotide, and is herein
referred to as a ligated first reagent.

In step e), non-ligated reagents are
removed, e.g. by washing. In step f), the linker of
the ligated first reagent is broken to detach the tag
chain, which is recovered and analysed as an indication
of the sequence of a first part of the target DNA.

Preferably, removal of the linker also
exposes a hydroxyl or phosphate group at the end of the
first oligonucleotide chain, making it available for
ligation with the oligonucleotide chain of a second
reagent. Several methods for breaking the linker,
including photochemical and enzymatic and chemical
hydrolysis, can be used to generate the 3'-hydroxyl
or 5'-phosphate group needed for further ligation.
Steps c), d), e) and f) are then repeated. These
steps involve hybridisation of a second reagent from
the library, ligation, recovery and analysis of the tag
chain of the ligated second reagent, and generation of
another 3'-hydroxyl or 5'-phosphate group needed for
further ligation. The process can be repeated until
the whole DNA sequence has been read or until yields in
the reaction become too low to be useful.
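
The cycle of steps c) to f) can be sketched as an idealised simulation. This is an illustration under strong simplifying assumptions that are not part of the specification: hybridisation is perfectly specific, ligation and cleavage are complete, strand orientation is ignored (bases are complemented position by position), and the tag is modelled as a tuple of the Table 1 reporter weights. All function names are hypothetical.

```python
from itertools import product

# Reporter formula weights per base, as listed in Table 1 (assumed mapping).
REPORTER_WEIGHT = {"A": 30, "C": 31, "G": 32, "T": 33}
BASE_FOR_WEIGHT = {weight: base for base, weight in REPORTER_WEIGHT.items()}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def build_library(n: int) -> dict[str, tuple[int, ...]]:
    """Map every n-mer to its tag (one reporter weight per position)."""
    return {"".join(chain): tuple(REPORTER_WEIGHT[b] for b in chain)
            for chain in product("ACGT", repeat=n)}


def complement(sequence: str) -> str:
    """Position-by-position complement; orientation is ignored here."""
    return "".join(COMPLEMENT[base] for base in sequence)


def sequence_by_progressive_ligation(template: str, n: int = 6) -> str:
    """Read the template n bases per cycle from the tags of ligated reagents."""
    library = build_library(n)
    read = []
    for start in range(0, len(template) - n + 1, n):
        window = template[start:start + n]
        # Steps c) and d): only the fully complementary reagent is ligated.
        ligated = complement(window)
        # Steps e) and f): wash, cleave the linker, recover and read the tag.
        tag = library[ligated]
        reagent_sequence = "".join(BASE_FOR_WEIGHT[w] for w in tag)
        # The template bases are the complement of the reagent read from the tag.
        read.append(complement(reagent_sequence))
    return "".join(read)


if __name__ == "__main__":
    print(sequence_by_progressive_ligation("GATTACAGGCCT"))
```

In this model a template whose length is not a multiple of n is read only up to the last complete n-mer; real yields would also fall with each cycle, as noted above.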

Four stages of this sequence are shown
diagrammatically in Figure 4. The first diagram
corresponds to the situation at the end of step e)
first time round. The second diagram corresponds to
the situation at the end of step f). The third diagram
corresponds to the position at the end of step c)
second time round. The fourth diagram corresponds to
the situation at the end of step d) second time round.
The cyclic nature of the technique is indicated.
2. Nucleic acid sequencing of multiple templates by
sequential ligation.

In an extension of the first example, it is
envisaged that many sequences could be analysed
simultaneously. For example, individual clones of the
DNA to be sequenced could be immobilised:

a) Use can be made of an array of pins with the
same vector oligonucleotide immobilised on the end of
each. An individual clone of the target DNA is
hybridised to the oligonucleotide immobilised on each
individual pin. The array of pins carrying these
hybrids is then incubated with the library of coded
oligonucleotide reagents in a solution which also
contains the ingredients for ligation. As a result of
this step, each pin carries a different ligated
reagent. Finally, the tag chain of each ligated
reagent is recovered and analysed as before. If the
pins of the array are suitably spaced, they may be
dipped into the wells of microtitre plates, the first
plate containing the templates to be sequenced, the
second the library of reagents and ligation solution,
and the third plate containing a reagent for cleaving
the tag chains from the pins.

b) Alternatively, a surface may be coated with
the primer oligonucleotide, preferably covalently
attached through its 5' end or alternatively at some
other point. Individual clones of the DNA to be
sequenced are spotted at spaced locations on the coated
support, so that each individual clone of the target
DNA is hybridised to the oligonucleotide immobilised at
an individual spaced location on the support. The
support is then incubated with a solution containing
the library of reagents and the ingredients for
ligation. Non-ligated reagents are removed. Then the
linker of the ligated reagent at each spaced location
is cleaved and the tag recovered and analysed.
Cleavage is preferably effected by a method such as
laser desorption which can address small areas on the
surface. An advantage of this approach is that very
large numbers of DNA sequences can be analysed
together.

3. Extension of methods for sequence determination by
hybridisation to oligonucleotides

a) Format I. 

Methods for spotting DNAs at high density on
membranes are well established [Hoheisel et al., 1992;
Ross et al., 1992]. For fingerprinting and for
sequence determination, oligonucleotides must be
applied either singly or in small sets so that the
hybridisation patterns are not too complex to
interpret; as a consequence, only a small proportion of
templates give signal at each round of analysis. If the
signal from each hybridisation contained coded
information which allowed its sequence to be
determined, more complex mixtures could be used and
much more information collected at each round of
hybridisation. The complexity of the mixture would
depend on the length of the DNA templates and on the
ability of the analytical method to resolve sequences
in mixed oligonucleotides.

Nucleic acid probes encoded with these mass
spectrometry tags or reporter groups will be very
valuable where the use of multiple probes is
advantageous, e.g. DNA fingerprinting or mutation
analysis. The mass spectrometry tags offer the
advantage of multiplexing.

A number of different probes each labelled
with its own unique and appropriate mass spectrometry
tag can be used together in typical nucleic acid
hybridisation assays. The sequence of each individual
probe which hybridises can be uniquely determined in
the presence of others because of the separation and
resolution of the tags in the mass spectrum.

In this aspect, the invention provides a
method of sequencing a target nucleic acid, which
method comprises the steps of:

i) providing the target nucleic acid immobilised
on a support. Preferably individual clones of the
target nucleic acid are immobilised at spaced locations
on the support,

ii) incubating the immobilised target nucleic
acid from i) with a plurality of the coded
oligonucleotide reagents described above, so that the
oligonucleotide chains of different reagents become
hybridised to the target nucleic acid on the support,

iii) removing non-hybridised reagents, and

iv) recovering and analysing the tag moiety of
each reagent as an indication of the sequence of a part
of the target nucleic acid.

Preferably thereafter use is made of the
library of reagents, with the hybridisation, ligation,
cleavage and analysis steps being repeated cyclically
to provide additional information about the sequence of
the target nucleic acid.


b) Format II. 

It is possible to determine nucleic acid 
sequences from the pattern of duplexes formed when they 
are hybridised to an array of oligonucleotides. The 
length of sequence that can be determined is 
approximately the square root of the size of the array: 
if an array of all 65,536 octanucleotides is used, the 
sequences to be determined should be around 200 bp 
[Southern et al., 1992]. The limit in size is imposed 
by the constraint that no run of eight bases should 
occur more than once in the sequence to be determined. 
The array and its use in sequence determination are 
described in International patent application 
WO 89/10977; and a method of providing an array of 
oligonucleotides immobilised e.g. by their 5'-ends or 
their 3'-ends on a surface is described in 
International application WO 90/03382. 
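
The uniqueness constraint stated above can be checked directly on a candidate sequence. The following is a minimal sketch in Python and is not part of the method described here; the function names `repeated_kmers` and `is_resolvable` are hypothetical, and the sequence is treated as a plain string of bases with perfect hybridisation assumed.

```python
from collections import Counter


def repeated_kmers(sequence: str, k: int = 8) -> dict[str, int]:
    """Return every run of k bases that occurs more than once, with its count."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n > 1}


def is_resolvable(sequence: str, k: int = 8) -> bool:
    """True when no run of k bases occurs more than once in the sequence."""
    return not repeated_kmers(sequence, k)


# A complete array of k-mers has 4**k cells; for octanucleotides this is 65,536.
ARRAY_SIZE_OCTAMERS = 4 ** 8
```

A sequence for which `is_resolvable` returns False contains at least one repeated run of k bases, which is the situation the tagged reagents are intended to help resolve.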

By the method of the present invention, the 
sequence length that can be determined can be greatly 
extended. In this aspect of the invention, the method 
comprises the steps of: 

a) providing an array of oligonucleotides 
immobilised at spaced locations on a support, the 
oligonucleotide at one location being different from 
oligonucleotides at other locations. Preferably the 
sequence is known of the oligonucleotide immobilised by 
a covalent bond at each spaced location on the support, 

b) incubating the target nucleic acid with the 
array of immobilised oligonucleotides, so as to form 
hybrids at one or more spaced locations on the support, 

c) incubating the hybrids from b) with the 
library of coded oligonucleotide reagents, so that an 
oligonucleotide chain of a reagent of the library 
becomes hybridised to the target nucleic acid adjacent 
each immobilised oligonucleotide, 

d) ligating adjacent oligonucleotides thus 
forming ligated reagents at the one or more spaced 
locations on the support, 

e) removing other non-ligated reagents, and 

f) recovering and analysing the tag moiety of 
each ligated reagent as an indication of the sequence 
of a part of the target nucleic acid. 

Preferably cleavage of the tag chain at each 
spaced location is effected photochemically by means of 
a laser. Preferably analysis of the tag chains is by 
mass spectrometry. Preferably the hybridisation, 
ligation, cleavage and analysis steps are repeated 
cyclically, as described above, so as to obtain 
additional information about the sequence of the target 
nucleic acid. 
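
The information gained at each array location by steps c) to f) can be illustrated with a short sketch. This is a simplified model under stated assumptions, not the procedure itself: all sequences are written on the target strand (strand complementarity is ignored), hybridisation and ligation are assumed to be perfect, and the function name `ligation_reads` and its parameters are hypothetical.

```python
def ligation_reads(target: str, k: int = 8, ext: int = 8) -> dict[str, set[str]]:
    """Map each k-mer location of the array to the ext-mers read next to it.

    The key stands for the array location whose immobilised oligonucleotide
    pairs with that run of k bases in the target; the values stand for the
    sequences reported by the tags of the reagents ligated at that location.
    """
    reads: dict[str, set[str]] = {}
    for i in range(len(target) - k - ext + 1):
        anchor = target[i:i + k]              # region held by the array oligonucleotide
        adjacent = target[i + k:i + k + ext]  # region read by the ligated reagent
        reads.setdefault(anchor, set()).add(adjacent)
    return reads
```

In this model a location holding more than one adjacent sequence corresponds to a run of k bases that occurs more than once in the target, so the tags must be resolved from a mixture at that location.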

A preferred sequence of operations is shown 
in the four diagrams constituting Figure 5. The first 
diagram shows the position at the start of step b). 
The second diagram shows the position at the end of 
step b) - a portion of the target nucleic acid has 
become hybridised to a tethered oligonucleotide forming 
part of the array. The third diagram shows the 
position at the end of step c), and the fourth diagram 
shows the position at the end of step d); a reagent 
from the library has become hybridised to the target 
nucleic acid and ligated to the immobilised 
oligonucleotide. 

The results of this extension of the known 
method are dramatic. A single extension by a length 
equal to the length of the oligonucleotides in the array 
squares the overall length that can be read, provided 
that the method used to read the tags can resolve 
mixtures. In this case the length that can be read 
from an array of octanucleotides extended by eight 
bases is around 60,000 bases. 

Comparison of hybridisation analysis with tagged 
oligonucleotides with: 

a) Gel-based methods. 

The most advanced instrument for automated 
sequence analysis is capable of reading around 40000 
bases per day. This does not include the time for the 
biological and biochemical processes needed to provide 
the reactions that are loaded on the gel. If we assume 
that templates can be applied to a surface at a density 
of one per square millimeter [Hoheisel et al., 1992; 
Ross et al., 1992], 10000 could be applied to an area 
of 100 x 100 mm. After hybridisation, there would be 
several fmol of tagged oligonucleotide in each cell so 
a single 2 nsec pulse of the laser may release enough 
tag to read, but even if we assume that 100 pulses are 
needed, then the total time for a cell to be read is a 
few msec, so that all 10000 cells could be read in a 
few minutes. If the oligonucleotides were hexamers, the 
raw data acquired would be 60000 bases. For sequence 
determination, this would not be as informative as the 
equivalent raw data from a gel, because much longer 
continuous lengths are read from gels. This advantage 
for gels would, of course, be lost if the sequence read 
from the array could be extended by further rounds of 
analysis. But the fundamental advantage of array-based 
approaches is the parallelism which enables thousands 
of templates to be analysed together; the number that 
can be analysed on a gel is limited by the width of the 
gel to less than fifty. 
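
The arithmetic behind these figures can be set out as a short sketch. The function names are hypothetical, and the per-cell read time of 5 msec used in the usage note is an assumed value chosen from the "few msec" range quoted above; it covers laser reading only.

```python
def array_capacity(width_mm: float, height_mm: float, density_per_mm2: float = 1.0) -> int:
    """Number of templates that fit on the surface at the given density."""
    return int(width_mm * height_mm * density_per_mm2)


def raw_bases(n_templates: int, oligo_length: int) -> int:
    """Raw data acquired in one round: one oligonucleotide read per template."""
    return n_templates * oligo_length


def read_time_seconds(n_cells: int, ms_per_cell: float) -> float:
    """Total laser read time for the array, given an assumed time per cell."""
    return n_cells * ms_per_cell / 1000.0
```

With a 100 x 100 mm surface at one template per square millimeter, `array_capacity(100, 100)` gives 10000 cells and `raw_bases(10000, 6)` gives 60000 bases for hexamers. At the assumed 5 msec per cell, `read_time_seconds(10000, 5.0)` is 50 seconds, which is within the "few minutes" stated above.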

b) Present array-based methods. 

The major drawbacks of existing array-based 
methods are: 

a) The sequence that can be read from an array 
of size N is only √N, so that most cells of the array 
are empty. By adding tagged oligonucleotides, the 
occupancy of the array could be near complete, so that 
information would be obtained from most cells. The 
reason for this is that additional information from the 
tags helps remove ambiguities due to multiple 
occurrences of short strings in the target sequence 
(Table 3). 

b) The length of sequence that is read from each 
interaction with an oligonucleotide by hybridisation is 
necessarily limited to the length of the 
oligonucleotide. This causes problems in reading 
through repeating sequences, such as runs of a single 
base. Extending the read by ligation will permit reads 
as long as can be traversed by repeated ligations. 

c) Of present detection methods, radioactivity 
has high sensitivity but poor resolution, fluorescence 
has low sensitivity and high resolution; both are 
relatively slow. The proposal to use mass spectrometry 
could improve resolution, speed and sensitivity, as 
well as adding the potential to read the sequences of 
tags. 



Table 3. In general, the sequence that can be 
determined from templates distributed on a spatially 
segmented array is √(4^L) = 2^L, where L is the sum of 
the continuous lengths read by oligonucleotides. This 
would include the length of the oligonucleotide on the 
solid support in example 3b but not in example 2. 

L       2^L 

12      4096 
14      16384 
16      65536 
18      262144 
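
The entries of Table 3 follow directly from the relation given in its caption, as the following minimal sketch shows. The function name `readable_length` is hypothetical.

```python
import math


def readable_length(total_read_length: int) -> int:
    """Sequence length determinable for a total continuous read length L.

    There are 4**L possible sequences of length L, and the readable length
    is the square root of that number, which equals 2**L.
    """
    return math.isqrt(4 ** total_read_length)


# Reproduce the rows of Table 3 as (L, 2^L) pairs.
TABLE_3 = [(L, readable_length(L)) for L in (12, 14, 16, 18)]
```

The row for L = 16 corresponds to an array of octanucleotides extended by eight bases, for which a readable length of around 60,000 bases is quoted in the text.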



Analytes with potential pharmacological activity. 

Many drugs are tissue-specific. Their action 
often depends on interaction with a cell-surface 
receptor. There are families of drugs based on core 
structures; for example, there are several comprising 
short peptides. It is useful to be able to trace 
candidate drugs to see which cells or tissues they may 
target. It would be useful to be able to trace many 
different candidates simultaneously. Using libraries 
of analytes tagged with coded mass-tags, it would be 
possible to trace interactions by examining cells or 
tissues in the mass spectrometer. If tags were 
attached by photolabile protecting groups, it would be 
possible to image whole animal or tissue sections using 
scanning laser cleavage, coupled with mass 
spectrometry. 

The following Examples further illustrate the 

invention. 

Examples 1 to 6 show steps, according to the 
following Reaction Scheme 1, in the synthesis of a 
compound (8) comprising an aromatic linker carrying: a 
methyloxytrityl group (-CH2ODMT) for analyte synthesis; 
an o-nitro group for photocleavage; an O-t-butyl 
diphenyl silyl group (OTBDPS) for tag synthesis; a 
tertiary amino group for conversion to a positively 
charged group for analysis by mass spectrometry; and an 
N-hydroxysuccinimidyl group for attachment to a support. 

Examples 7 and 8 show subsequent steps 
according to the following Reaction Scheme 2. 

Examples 9 and 10 show steps, according to 
the following Reaction Scheme 3, of preparing reporter 
groups (13) based on propan-1,3-diol. 

Examples 11 to 13 show steps, according to 
the following Reaction Scheme 4, involved in attaching 
a protected propan-1,3-diol residue as a reporter 
group to compound (6). 

Examples 14 to 19 describe the preparation, 
characterisation and use of various reagents according 
to the invention. 

General detail 

5-Hydroxy-2-nitrobenzyl alcohol was purchased from Aldrich, long chain alkylamino 
controlled pore glass from Sigma. Anhydrous solvents refer to Aldrich Sure Seal grade 
material packed under nitrogen. Triethylamine was predistilled from calcium hydride and 
stored under nitrogen prior to use. Other solvents and reagents are available from a range 
of commercial sources. 

NMRs were obtained on a Jeol 270 MHz machine using the solvent indicated and 
referenced to tetramethylsilane. 

Infra Reds were obtained on a Nicolet 5DXC F.T. IR machine either as a potassium bromide 
disc or chloroform solution as indicated. 

Melting points were obtained on a Gallenkamp melting point apparatus and are uncorrected. 

Tlcs were run on Kieselgel 60 F254 aluminium backed Tlc plates using the solvent system 
indicated. The plates were visualised by ultra violet and/or dipping in a 3% w/v 
ethanolic solution of molybdophosphoric acid and then heating with a hot air gun. Trityl 
containing species show up as a bright orange spot, alcohols as a blue spot. 

Silica gel chromatography was performed using flash grade silica gel, particle size 40 - 
63 µm. 

Abbreviations used in the reaction schemes and text. 

DMT       4,4'-dimethoxytrityl 
THF       tetrahydrofuran 
TBDPS     tert-butyldiphenylsilane 
DMAP      4-dimethylaminopyridine 
DCCI      dicyclohexylcarbodiimide 
CH2Cl2    dichloromethane 
CPG       controlled pore glass 
MeI       iodomethane 
Tresyl    2,2,2-trifluoroethylsulphonyl 

EXAMPLE 1 

Synthesis of 5-hydroxy-O-(4,4'-dimethoxytrityl)-2- 
nitrobenzyl alcohol (Compound 2, Scheme 1) 

To 5-hydroxy-2-nitrobenzyl alcohol (5.11 g, 30.2 mmol) dissolved in anhydrous pyridine 
(40 ml) was added 4,4'-dimethoxytrityl chloride (10.25 g, 30.2 mmol) and the flask 
stoppered. The reaction mixture was then left to stir at room temperature for a total of 72 
hours. T.l.c. analysis (ether/pet. ether 40 - 60°C, 65%/35%) revealed the presence of a new 
trityl positive containing material with an Rf of 0.27 and disappearance of the starting 
alcohol. The pyridine was then removed by rotary evaporation, with the last traces being 
removed by co-evaporation with toluene (x2). The resultant gum was dissolved in ethyl 
acetate and the solution washed with water (x1) and brine (x1). The ethyl acetate solution 
was then dried over anhydrous magnesium sulphate and evaporated to a reddish brown gum. 
The gum was dissolved in CH2Cl2 (20 ml) and then applied to a silica gel column (14 cm x 
6.5 cm) which was eluted with ether/pet. ether 40 - 60°C, 65%/35%. The product 
fractions were combined and the solvent removed by rotary evaporation to give an off white 
solid (13.49 g, 95%, mpt. 80 - 82°C with decomposition). An analytical sample was 
prepared by recrystallisation from chloroform/pet. ether 40 - 60°C, mpt. 134 - 7°C with 
decomposition. 
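
The quoted yield can be cross-checked from the stated quantities. The sketch below is illustrative only: the molecular formula C28H25NO6 for compound 2 is an assumption derived from the compound name (it is not stated in the text), the atomic masses are standard rounded values, and the function names are hypothetical.

```python
# Standard atomic masses in g/mol, rounded to three decimal places.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Assumed formula of compound 2, derived from its name (not given in the text).
COMPOUND_2 = {"C": 28, "H": 25, "N": 1, "O": 6}


def molar_mass(formula: dict[str, int]) -> float:
    """Molar mass in g/mol from an element-count mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def percent_yield(product_g: float, limiting_mmol: float, product_molar_mass: float) -> float:
    """Isolated yield as a percentage of the theoretical mass of product."""
    theoretical_g = limiting_mmol / 1000.0 * product_molar_mass
    return 100.0 * product_g / theoretical_g
```

With 13.49 g isolated from 30.2 mmol of starting alcohol, `percent_yield(13.49, 30.2, molar_mass(COMPOUND_2))` rounds to the 95% quoted above. The same assumed molar mass (about 471.5 g/mol) is consistent with Example 2, where 10.18 g of compound 2 is stated to be 21.6 mmol.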

1H NMR (270 MHz, CDCl3, δ): 3.79 (s, 6H, DMT-OCH3), 4.63 (s, 2H, CH2-ODMT), 6.77 - 
6.85 (m, 5H, aryl), 7.22 - 7.49 (m, 9H, aryl), 7.63 (s, 1H, aryl), 8.06 (d, 1H, J = 9.06 Hz, 
aryl). 

IR (KBr disc), 1610, 1509, 1447, 1334, 1248, 1090, 1060, 1033, 828 cm-1. 

EXAMPLE 2 

Synthesis of O-(4,4'-dimethoxytrityl)-5-[1-(3-bromo-1- 
oxypropyl)]-2-nitrobenzyl alcohol (Compound 3, Scheme 1) 

To compound 2 (10.18 g, 21.6 mmol) dissolved in acetone (150 ml) was added 1,3- 
dibromopropane (11 ml, 108 mmol) and potassium carbonate (4.47 g, 32.3 mmol). The 
reaction mixture was then heated at 50°C for a total of three hours and then stirred at room 
temperature for a further 16 hours. T.l.c. analysis (ether/pet. ether 40 - 60°C, 60%/40%) 
showed complete disappearance of the starting material and the formation of two new trityl 
containing species; Rf 0.48 major, 0.23 minor. The acetone was then removed by rotary 
evaporation and the resultant residue partitioned between water and dichloromethane. The 
dichloromethane solution was separated and washed with brine. The dichloromethane 
solution was then dried over anhydrous magnesium sulphate and evaporated down to a gum. 
The gum was dissolved in dichloromethane (20 ml) and then applied to a silica gel column (6.5 
cm x 14 cm) which was eluted with ether/pet. ether 40 - 60°C, 60%/40%. The pure 
product fractions were combined and the solvent removed by rotary evaporation to give 
compound 3 as a white solid (8.18 g, 64%, mpt. 132 - 4°C, Rf 0.48 ether/pet. ether 40 - 
60°C, 60%/40%). A small sample was recrystallised from ethyl acetate/pet. ether for 
analytical purposes, mpt. 132 - 4°C. 

1H NMR (270 MHz, CDCl3, δ): 2.40 (m, 2H, -CH2-CH2-CH2-), 3.64 (t, 2H, J = 6.32 Hz, 

ODMT), 6.84 (d, 4H, J = 8.79 Hz, DMT aryl), 7.20 - 7.50 (m, 10H, 9 DMT aryl, 1 aryl), 
7.68 (s, 1H, aryl), 8.1 (d, 1H, J = 9.06 Hz, aryl). 

IR (KBr disc) 1608, 1577, 1511, 1289, 1253, 1230, 1174, 1065, 1030 cm-1. 

EXAMPLE 3 

Synthesis of N-[O-(tert-butyldiphenylsilyl)-2-oxyethyl] 
-N-(2-hydroxyethyl)amine (Compound 5, Scheme 1) 

To sodium hydride (0.76 g of a 60% dispersion in oil, 19 mmol) under N2 was added 
anhydrous THF (15 ml) followed by a slurry of diethanolamine (2 g, 19 mmol) in THF (30 
ml) at such a rate as the evolution of hydrogen permitted. The reaction mixture was then 
stirred at room temperature for 30 minutes under N2 during which time a grey precipitate 
formed. The generated alkoxide was quenched by the addition of 
tert-butylchlorodiphenylsilane (4.95 ml, 19 mmol) followed by stirring the reaction at room 
temperature for two hours under N2. T.l.c. analysis (ethyl acetate) showed the generation 
of two new UV positive spots relative to starting material, major Rf 0.05, minor Rf 0.60. 
The THF was removed by rotary evaporation and the residue dissolved in a 0.1 M sodium 
bicarbonate solution. The product was then extracted with ethyl acetate (x2). The ethyl 
acetate extracts were then combined and washed with brine (x1). The ethyl acetate solution 
was then dried over anhydrous magnesium sulphate and evaporated down to an oil. This oil 
was applied to a silica gel column which was eluted with chloroform/methanol, 90%/10%. 
Fractions with an Rf of 0.33 were combined and rotary evaporated to give compound 5 as 
a white crystalline solid (3.93 g, 60%, mpt. 73 - 75°C). A small sample was recrystallised 
from ethyl acetate/pet. ether 40 - 60°C for analytical purposes, mpt. 76 - 77°C. 

1H NMR (270 MHz, CDCl3, δ): 1.06 (s, 9H, tBu), 2.13 (br s, 1H, OH, D2O exchangeable), 
2.78 (m, 4H, CH2NHCH2), 3.63 (t, 2H, J = 5.22 Hz, -CH2OSi-), 3.78 (t, 2H, J = 5.22 
Hz, CH2OH), 7.40 (m, 6H, aryl), 7.66 (m, 4H, aryl). 

IR (KBr disc) 3100, 1430, 1114, 1080, 969, 749, 738, 707 cm-1. 

EXAMPLE 4 

Synthesis of N-[O-(tert-butyldiphenylsilyl)-2-oxyethyl] 
-N-[O-(3-(O-(4,4'-dimethoxytrityl)-1-oxymethyl)-4-nitrophenyl) 
-3-oxypropyl]-N-(2-hydroxyethyl)amine (Compound 6, Scheme 1) 

To compound 3 (7.46 g, 12.6 mmol) dissolved in 1-methyl-2-pyrrolidinone (65 ml) was 
added compound 5 (8.63 g, 25.2 mmol). The reaction mixture was then heated at 80°C for 
a total of 5 hours before being left to cool and stir at room temperature for a further 16 
hours. T.l.c. analysis (ethyl acetate) showed the formation of a new trityl containing species, 
Rf 0.51, and residual amounts of the two starting materials. The reaction mixture was poured 
into a mixture of water (600 ml) and brine (100 ml) and the product extracted with ethyl 
acetate (3 x 200 ml). The ethyl acetate extracts were then combined and dried over anhydrous 
magnesium sulphate. The ethyl acetate was then removed by rotary evaporation to give a 
brown gum from which a crystalline product slowly formed. The minimum amount of ethyl 
acetate was added to dissolve the residual gum so that the crystalline product, the 
hydrogen bromide salt of compound 5, could be filtered off. The ethyl acetate solution was then 
applied to a silica gel column (13 cm x 6.5 cm) which was eluted with ethyl acetate. 
Insufficient separation of residual compound 3 and the desired product was obtained from this 
column so the product fractions were combined and evaporated to a gum. This gum was 
dissolved in the minimum of ethyl acetate necessary and applied to another silica gel 
column (14 cm x 6.5 cm) using gradient elution, first ethyl acetate/pet. ether 40 - 
60°C, 50%/50%, followed by ethyl acetate. The product fractions were combined and the 
solvent removed by rotary evaporation to give compound 6 as a gum. The last traces of 
solvent were removed by placing the gum under high vacuum for one hour. The yield of 
product was 7.64 g, 71%. 

1H NMR (270 MHz, CDCl3, δ): 1.04 (s, 9H, tBu), 1.97 (m, 2H, -CH2CH2CH2-), 2.7 (m, 
6H, NCH2), 3.56 (m, 2H, CH2OH), 3.75 (m, 2H, CH2OSi), 3.78 (s, 6H, DMT-OCH3), 4.12 
(m, 2H, ArOCH2CH2), 4.64 (s, 2H, ArCH2ODMT), 6.74 - 6.85 (m, 5H, aryl), 7.2 - 7.65 (m, 
20H, aryl), 8.05 (d, 1H, aryl). 

IR (KBr disc), 1608, 1579, 1509, 1287, 1251, 1232, 1112, 1092, 1064, 1035, 826, 754, 
703, 613 cm-1. 

EXAMPLE 5 

Synthesis of N-[O-(tert-butyldiphenylsilyl)-2-oxyethyl]-N- 
[O-(3-(O-(4,4'-dimethoxytrityl)-1-oxymethyl)-4-nitrophenyl) 
-3-oxypropyl]-N-[O-(3-carboxylatopropionyl)-2-oxyethyl] 
amine (Compound 7, Scheme 1) 

To compound 6 (5.64 g, 6.59 mmol) dissolved in anhydrous dichloromethane (40 ml) and 
anhydrous pyridine (50 ml) was added succinic anhydride (2.06 g, 20.6 mmol) and 
dimethylaminopyridine (210 mg, 1.72 mmol) and the flask stoppered. The reaction was then 
stirred at room temperature for a total of 72 hours. T.l.c. analysis (methanol/ethyl acetate, 
10%/90%) showed the formation of a new trityl containing species, Rf 0.45, and the 
disappearance of the starting material. The solvent was removed by rotary evaporation with 
the last traces of pyridine being removed by co-evaporation with toluene (x2). The resultant 
gum was then partitioned between chloroform and water. The organic phase was separated 
and the aqueous phase further extracted with chloroform (x1). The organic phases were then 
combined and washed with brine (x1). The chloroform solution was then dried with 
anhydrous magnesium sulphate and evaporated to a gum. The last traces of solvent were 
then removed by placing the gum under high vacuum for one hour to give compound 7, 
6.75 g. This product was used in the next step without further purification. 

1H NMR (270 MHz, CDCl3, δ): 1.0 (s, 9H, tBu), 1.9 (m, 2H, CH2CH2CH2), 2.5 (m, 4H, 
COCH2CH2COOH), 2.7 (m, 6H, N-CH2), 3.7 (m, 2H, CH2OSi), 3.75 (s, 6H, DMT-OCH3), 
4.1 (m, 4H, CH2OCO and Ar-OCH2CH2), 5.6 (s, 2H, ArCH2ODMT), 6.7 (d, 1H, aryl), 6.8 
(d, 4H, aryl), 7.2 - 7.7 (m, 20H, aryl), 8.02 (d, 1H, aryl). 

IR (CHCl3 solution) 1736, 1608, 1579, 1509, 1288, 1251, 1232, 1175, 1158, 1112, 1093, 
1065, 1035, 755, 703 cm-1. 

EXAMPLE 6 

Synthesis of N-[O-(tert-butyldiphenylsilyl)-2-oxyethyl]-N- 
[O-(3-(O-(4,4'-dimethoxytrityl)-1-oxymethyl)-4-nitrophenyl)-3- 
oxypropyl]-N-[(O-(succinyl(3-carboxylatopropionyl)))-2- 
oxyethyl]amine (Compound 8, Scheme 1) 

To compound 7 (2.99 g, 3.13 mmol) dissolved in anhydrous dichloromethane (30 ml) was 
added dicyclohexylcarbodiimide (0.710 g, 3.45 mmol) and N-hydroxysuccinimide (0.396 g, 
3.44 mmol) and the flask stoppered. The reaction mixture was then allowed to stir at room 
temperature for 18 hours during which time a white precipitate formed. The white 
precipitate was filtered off and the dichloromethane solution washed with water (x1) and 
brine (x1). The dichloromethane solution was then dried over anhydrous magnesium sulphate 
and the solvent rotary evaporated off to give a foam, 3.26 g (99%). T.l.c. analysis (ethyl 
acetate) showed only one trityl containing species, Rf 0.74, and no significant contaminant. 
Attempts to provide an analytical sample by passing a small amount of material down a silica 
gel column resulted in the decomposition of the active ester back to the acid (Compound 7). 
The material was therefore used in all further experiments without further purification. 

1H NMR (270 MHz, CDCl3, δ): 1.04 (s, 9H, tBu), 1.97 (m, 2H, CH2CH2CH2), 2.50 - 2.75 
(m, 6H, succinyl CH2 + -OCCH2), 2.76 - 2.86 (m, 6H, NCH2), 3.08 (m, 2H, 
CH2CO, succinyl), 3.77 (s, 6H, DMT-OCH3), 3.86 (m, 2H, CH2OSi), 4.1 - 4.2 (m, 4H, 
ArOCH2 + CH2O2C), 4.63 (s, 2H, ArCH2ODMT), 6.7 - 6.9 (m, 5H, aryl), 7.01 - 7.7 (m, 
20H, aryl), 8.05 (d, 1H, aryl). 

IR (KBr disc), 1742, 1713, 1509, 1288, 1251, 1211, 1175, 1090, 1067 cm-1. 

EXAMPLE 7 

Derivatised long chain alkylamino controlled pore 
glass (Compound 9, Scheme 2) 

Long chain alkylamino controlled pore glass (Sigma Chemical Co, 3.5 g) was pre-treated with 
trichloroacetic acid (1.5 g) dissolved in dichloromethane (50 ml) for 2½ hours, washed with 
aliquots of dichloromethane (100 ml total) and anhydrous ether (100 ml total) and dried in 
vacuo. To the CPG support was then added anhydrous pyridine (35 ml), 
dimethylaminopyridine (42 mg, 344 µmol), triethylamine (280 µl, 2.01 mmol) and compound (8) (see 
scheme 1) (736 mg, 700 µmol). The mixture was then gently agitated for a total of 18 hours 
after which time the beads were given multiple washes of pyridine (7 x 10 ml), methanol (5 
x 15 ml) and chloroform (5 x 15 ml) and then dried in vacuo. 

EXAMPLE 8 

Methylation of the tertiary amino groups attached 
to the CPG support (Compound 10, Scheme 2) 

To the derivatised long chain alkylamino controlled pore glass (Compound 9, Scheme 2) 
(1.01 g) was added anhydrous THF (10 ml) and iodomethane (0.5 ml, 8 mmol). The mixture 
was then gently agitated for a total of 18 hours after which time the beads were given 
multiple washes of anhydrous THF (5 x 10 ml) and then dried in vacuo. 

SCHEME 2 

EXAMPLE 9 

Synthesis of mono protected 1,3-propanediol derivatives 
(Compounds 12a and 12b, Scheme 3) - general procedure 

To sodium hydride (1.05 g of a 60% dispersion in oil, 26.3 mmol) under N2 was added 
anhydrous THF (10 ml) followed by dropwise addition of the 1,3-propanediol derivative 
(26.3 mmol) dissolved in anhydrous THF (20 ml). Stirring for an additional 30 minutes 
under N2 ensured alkoxide formation as noted by the formation of a grey precipitate. The 
generated alkoxide was quenched by the dropwise addition of tert-butylchlorodiphenylsilane 
(7.24 g, 26.3 mmol) dissolved in anhydrous THF (20 ml) followed by stirring of the reaction 
under N2 for a further 40 minutes. The THF was then removed by rotary evaporation and 
the residue partitioned between dichloromethane and 0.1 M sodium bicarbonate solution. The 
dichloromethane solution was separated off and washed with brine (x1). The 
dichloromethane solution was then dried over magnesium sulphate and evaporated down to an oil. 
This oil was applied to a silica gel column (16 cm x 5 cm) which was eluted with an 
ether/pet. ether 40 - 60°C, 30%/70% mixture. The product fractions were combined and 
rotary evaporated down to provide the desired 1,3-propanediol derivative. 

For individual details of the compounds see below. 

12a 1-O-tert-butyldiphenylsilyl-1,3-propanediol, white crystalline solid, Rf 0.21 ether/pet. 
ether 40 - 60°C, 30%/70%, 7.61 g, 92%, mpt. 40 - 42°C. 

IR (KBr disc) 3400, 1448, 1112, 822, 734, 702, 689, 506, 488 cm-1. 

1H NMR (270 MHz, CDCl3, δ): 1.06 (s, 9H, tBu), 1.80 (m, 2H, CH2CH2CH2), 2.45 (t, 1H, 
OH), 3.84 (m, 4H, OCH2CH2CH2O-), 7.40 (m, 6H, aryl), 7.68 (m, 4H, aryl). 

12b 2-methyl-1-O-tert-butyldiphenylsilyl-1,3-propanediol. Colourless oil, Rf 0.21 ether/pet. 
ether 40 - 60°C, 30%/70%, 6.60 g, 77%. 

IR (thin film) 3400, 1472, 1428, 1087, 1040, 823, 740, 702 cm-1. 

1H NMR (270 MHz, CDCl3, δ): 0.82 (d, 3H, J = 6.87 Hz, CH3), 1.06 (s, 9H, tBu), 2.0 (m, 
1H, CH-CH3), 2.68 (t, 1H, OH), 3.64 (m, 4H, CH2CH(CH3)CH2), 7.40 (m, 6H, aryl), 
7.68 (m, 4H, aryl). 

See P G McDougal et al., JOC, 51, 3388 (1986) for general procedures for the monosilylation 
of symmetric 1,n-diols. 

EXAMPLE 10 

Synthesis of the treslate derivatives (Compounds 13a and 13b, 
Scheme 3) - general procedures 

To the alcohol derivative (4.94 mmol) dissolved in anhydrous dichloromethane (10 ml) and 
dry triethylamine (0.84 ml, 6.03 mmol) under N2 and cooled to between -15 and -30°C was 
added the tresyl chloride (1 g, 5.48 mmol) in anhydrous dichloromethane (5 ml) dropwise over 
20 - 40 minutes. Stirring for an additional 30 minutes under N2 at -15 to -30°C 
completed the reaction. The reaction mixture was then transferred to a separatory funnel and 
washed with ice cooled 1.0 M hydrochloric acid (x1), ice cooled water (x1) and ice cooled 
brine (x1). The dichloromethane solution was then dried over magnesium sulphate and the 
solvent rotary evaporated off to give the treslate. The treslates were stored at -20°C under 
N2 until required. 

For individual details of the compounds see below. 

13a 1-O-tert-butyldiphenylsilyl-3-O-tresyl-1,3-propanediol. White crystalline solid, 1.74 g, 
77%, mpt. 34 - 35°C. Three ml of this reaction mixture was removed prior to work up of 
the reaction for addition to other reactions. 

1H NMR (270 MHz, CDCl3, δ): 1.06 (s, 9H, tBu), 1.97 (m, 2H, CH2CH2CH2), 3.77 (t, 2H, 
J = 5.49 Hz, CH2-O-Si), 3.84 (q, 2H, J = 8.79 Hz, CF3-CH2-O), 4.54 (t, 2H, J = 6.05 Hz, 
Tresyl O-CH2), 7.42 (m, 6H, aryl), 7.64 (m, 4H, aryl). 

IR (KBr disc) 1386, 1329, 1274, 1258, 1185, 1164, 1137, 1094, 941, 763, 506 cm-1. 

13b 2-methyl-1-O-tert-butyldiphenylsilyl-3-O-tresyl-1,3-propanediol. Colourless oil, 2.57 g, 
99%. 

1H NMR (270 MHz, CDCl3, δ): 0.97 (d, 3H, J = 6.87 Hz, CH3), 1.06 (s, 9H, tBu), 2.10 
(m, 1H, CHCH3), 3.6 (m, 2H, CH2OSi), 3.8 (q, 2H, J = 8.79 Hz, CF3CH2), 4.40 (m, 2H, 
Tresyl-O-CH2), 7.40 (m, 6H, aryl), 7.64 (m, 4H, aryl). 

For general details of treslates see R K Crossland et al., JACS, 93, 4217 (1971). 

EXAMPLE 11 

Synthesis of N-[acetoxy-2-oxyethyl]-N-[O-(3-(O-(4,4'- 
dimethoxytrityl)-1-oxymethyl)-4-nitrophenyl)-3-oxypropyl]-N- 
[2-hydroxyethyl]amine (Compound 15, Scheme 4) 

To compound 11 (1.72 g, 1.92 mmol) dissolved in anhydrous THF (20 ml) was added 
tetrabutylammonium fluoride (0.55 ml of a 1 M solution in THF, 1.92 mmol). The reaction 
was then stirred for a total of two hours at room temperature. The reaction mixture was then 
diluted with water (50 ml) and the THF removed by rotary evaporation. The aqueous 
solution was then extracted with chloroform (x1). The organic solution was dried over 
anhydrous sodium sulphate and evaporated down to a gum. The product was purified by 
silica gel chromatography eluting the column with ethyl acetate. Product fractions were 
combined and rotary evaporated down to give compound 12 as a colourless gum which 
slowly crystallised on standing; 0.73 g, 58%, mpt. 95 - 97°C, Rf 0.26 ethyl acetate. 

1H NMR (270 MHz, CDCl3, δ): 1.75 (br s, 1H, OH), 2.0 - 2.1 (m, 5H, 
O2CCH3 + CH2CH2CH2), 2.70 - 2.81 (m, 6H, CH2N), 3.58 (m, 2H, CH2OSi), 3.79 (s, 6H, 
DMT-OCH3), 4.17 (m, 4H, CH2O), 4.64 (s, 2H, ArCH2ODMT), 6.83 (d, 4H, DMT-aryl), 
7.2 - 7.5 (m, 10H, aryl), 7.69 (s, 1H, aryl), 8.10 (d, 1H, aryl). 

IR (KBr disc), 3459, 1738, 1608, 1577, 1506, 1444, 1313, 1288, 1250, 1230, 1175, 1154, 
1070, 1035, 984 cm-1. 

EXAMPLE 12 

Synthesis of N-[O-(tert-butyldiphenylsilyl)-2-oxyethyl] 
-N-[O-(3-(O-(4,4'-dimethoxytrityl)-1-oxymethyl)-4-nitrophenyl) 
-3-oxypropyl]-N-[acetoxy-2-oxyethyl]amine (Compound 14, Scheme 4) 

To compound 6 (1.73 g, 2.02 mmol) dissolved in anhydrous pyridine (10 ml) was added 
acetic anhydride (0.5 ml, 4.54 mmol) and 4-dimethylaminopyridine (55 mg, 0.45 mmol) and 
the flask stoppered. The reaction mixture was then stirred at room temperature for a total 
of 16 hours after which time t.l.c. analysis (methanol/ethyl acetate 5%/95%) showed the 
complete disappearance of the starting material and the formation of a new trityl containing 
spot, Rf 0.80. The pyridine was removed by rotary evaporation with the last traces being 
removed by co-evaporation with toluene (x2). The resultant gum was partitioned between 
chloroform and water. The chloroform solution was separated off and washed with brine 
(x1). The chloroform solution was then dried over anhydrous magnesium sulphate and the 
solvent rotary evaporated off to give a colourless gum, 1.94 g. This material was pure 
enough to be used in the next reaction without any further purification. 

1H NMR (270 MHz, CDCl3, δ): 1.04 (s, 9H, tBu), 1.9 (m, 2H, CH2CH2CH2), 2.01 (s, 3H, 
-O2CCH3), 2.74 (m, 6H, CH2N), 3.7 (m, 2H, CH2OSi), 3.8 (s, 6H, DMT-OCH3), 4.1 (m, 4H, 
CH2O), 4.63 (s, 2H, ArCH2ODMT), 6.78 (d, 1H, aryl), 6.83 (d, 4H, DMT aryl), 7.2 - 7.8 
(m, 20H, aryl), 8.05 (d, 2H, aryl). 

EXAMPLE 13 

Synthesis of N-[acetoxy-2-oxyethyl]-N-[O-(3-(O-(4,4'-dimethoxytrityl) 
-1-oxymethyl)-4-nitrophenyl)-3-oxypropyl]-N-[O-(tert-butyldiphenylsilyl)-3-oxo-6- 
oxyhexyl]amine (Compound 16, Scheme 4) 

To compound 12 (66 mg, 0.10 mmol) dissolved in anhydrous acetonitrile (5 ml) was added 
potassium carbonate (55 mg, 0.4 mmol) and compound 13a (1 ml of the reaction mixture, 
approximately 0.30 mmol) and the flask then stoppered with a calcium chloride drying tube. 
The reaction mixture was then stirred at room temperature for a total of 22 hours after which 
time the potassium carbonate was filtered off and the solvent removed by rotary evaporation. 
The resultant oil was then applied to a silica gel column (14 cm x 1 cm) and the product 
eluted off with an ether/pet. ether 40 - 60°C, 75%/25% mixture. The pure product fractions 
were combined and evaporated down to a clear gum, 6 mg, 6%, Rf 0.47 in ether/pet. ether 
40 - 60°C, 80%/20%. 

1H NMR (270 MHz, CDCl3, δ): 1.05 (s, 9H, tBu), 1.8 (m, 2H, CH2CH2CH2OSi), 1.9 (m, 
5H, O2CCH3 + ArOCH2-), 2.76 - 2.92 (m, 6H, CH2N), 3.51 (t, 2H, J = 6.6 Hz, 
OCH2CH2CH2OSi), 3.79 (s, 6H, DMT-OCH3), 3.85 (m, 2H, CH2OSi), 4.12 - 4.23 (m, 4H, 
ArOCH2CH2 + NCH2CH2OCOCH3), 4.64 (s, 2H, ArCH2ODMT), 6.83 (m, 5H, 1 aryl + 
DMT-aryl), 7.23 - 7.50 (m, 16H, aryl), 7.68 (m, 4H, aryl), 8.10 (d, 1H, J = 9.06 Hz, aryl). 

By analogous reaction conditions to the above, the following compound has also been 
synthesised utilising the treslate 13b. 

N-[acetoxy-2-oxyethyl]-N-[O-(3-(O-(4,4'-dimethoxytrityl)-1-oxymethyl)-4-nitrophenyl)-3- 
oxypropyl]-N-[O-(tert-butyldiphenylsilyl)-5-methyl-3-oxo-6-oxyhexyl]amine. The compound 
is a clear gum, Rf 0.53 in ether/pet. ether 40 - 60°C, 80%/20%. 

1H NMR (270 MHz, CDCl3, δ): 0.88 (d, 3H, CH-CH3), 1.00 (s, 9H, tBu), 1.9 - 2.1 (m, 
6H, O2CCH3 + CH-CH3 + CH2CH2CH2), 2.7 - 3.0 (m, 6H, CH2N), 3.4 - 3.7 (m, 4H, 
CH2O-), 3.79 (s, 6H, DMT-OCH3), 4.0 - 4.4 (m, 6H, CH2O-), 4.64 (s, 2H, 
ArCH2ODMT), 6.83 (m, 5H, aryl), 7.2 - 7.7 (m, 20H, aryl), 8.01 (d, 1H, aryl). 

Example 14 

Synthesis of oligonucleotides on solid supports 

Controlled pore glass carrying linkers 9 and
10 (compounds 9 and 10 in Scheme 2) was loaded into the
columns used in the automatic oligonucleotide
synthesiser (ABI 381A); the amounts used provided for
0.2 or 1 µmol scale synthesis. The columns were
inserted in the automatic synthesiser, which was then
programmed for appropriate cycles. Two different types
of nucleotide precursors were used: normal
phosphoramidites, with dimethoxytrityl protecting
groups on the 5' hydroxyls; and "reverse synthons", with 5'
phosphoramidites and dimethoxytrityl protecting groups
on the 3' hydroxyls. A list of oligonucleotides
synthesised on these supports is shown in Table 4, in
which R9 and R10 derive from compounds 9 and 10
respectively. Yields were monitored from the amount of
dimethoxytrityl group released at each coupling. These
yields corresponded to those obtained on the CPG
supports used for conventional oligonucleotide synthesis.

Table 4 

End group(s)   Sequence   Normal direction   Reverse direction

R9   T^   /

R10   ^5   /

R10, DMT   Tg   /

R9, DMT   Tg   /

R10, DMT   Tg   /

R10   A10   /

Example 15

Synthesis of tags under conditions which leave the 
analyte intact. 

After synthesising 5'R9T^ on support 9, the
solid support was divided; part was treated with 5 mM
tetrabutylammonium fluoride in THF for 10 min at room
temperature to remove the t-butyldiphenylsilyl
protecting group. Both samples were treated with 29%
ammonia at room temperature overnight to remove the
products from the solid support. Ammonia was removed
under vacuum, and the solid residue dissolved in water.
HPLC showed the successful removal of the silyl
protecting group with retention of the DMT group. This
example shows that each of the two protecting groups can be
removed under conditions which leave the other in
place; and further, that removal of the protecting
groups leaves the oligonucleotide chain intact.

Example 16 

Biochemical reactions of tagged analytes. 

16a. Enzymatic phosphorylation of tagged oligonucleotides.

For many purposes, it will be useful
to have oligonucleotides which have a phosphate group
at the 5' end. Such a group is necessary if the
oligonucleotide is to be used as the donor in a
ligation reaction; and it is a useful way of
introducing a radioactive group to test biochemical
properties.

The oligonucleotides A5, A10 and T5 were
made with the tags R9 and R10 attached to the 3' ends,
with and without the silyl protecting group removed.
(This was achieved by treating the oligonucleotide,
still on the solid support, with a 5 mM solution of
tetrabutylammonium fluoride in acetonitrile, at room
temperature for 15 min.) These oligonucleotides were
phosphorylated using T4 polynucleotide kinase and
gamma-32P-ATP using standard protocols recommended by
the supplier. Thin layer chromatography of the
products on polyethyleneimine (PEI) impregnated
cellulose developed in 0.5M ammonium bicarbonate showed
in each case that the labelled phosphorus had been
transferred almost completely to the oligonucleotide.

16b. Enzymatic ligation of tagged oligonucleotides.

For some applications of tagged
oligonucleotides, it will be useful to ligate them to a
receptor. We have shown that tagged oligonucleotides
can take part in enzymatic ligation by the following
tests:

(1) Oligonucleotides tagged at the 5'-end.
In this test, the template was 5'
ATCAAGTCAGAAAAATATATA (SEQ ID No. 1). This was
hybridised to the donor, 3' TAGTTCAGTC (SEQ ID No. 2),
which had been phosphorylated at its 5'-end using
radioactive phosphorus. Four ligation reactions were
carried out, each with a modification of the sequence
T5, which could ligate to the 5' phosphorylated end of
the donor after hybridising to the run of five A's in the
template. The four oligoT's used in the reactions
differed in the nature of their 5'-end. One had a
dimethoxytrityl group attached through the hydroxyl.
The second and third had tags R9 and R10 attached to
the 5'-end through a phosphodiester bond. The fourth
was a positive control, with a normal 5' OH. A negative
control lacked any oligoT. Ligation reactions were
performed using T4 ligase according to the supplier's
instructions. Reactions were analysed by TLC on
PEI-cellulose, developed in 0.75M ammonium bicarbonate
solution. All four reactions showed an additional
spot on the chromatogram, of lower mobility than the
donor; as expected, the negative control showed no
additional spot. This illustrates how oligonucleotides
with different tags can take part in sequence-specific
ligation reactions.
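
The geometry of this test can be checked with a short sketch. It takes the template and donor sequences from SEQ ID Nos. 1 and 2 (both written 5' to 3') and confirms that the donor pairs with the 5' end of the template and that T5 pairs with the run of A's immediately beside it. The function names are illustrative only, and perfect Watson-Crick pairing is assumed.

```python
# Sequences as given in the text and sequence listing, written 5'->3'.
TEMPLATE = "ATCAAGTCAGAAAAATATATA"  # SEQ ID No. 1
DONOR = "CTGACTTGAT"                # SEQ ID No. 2 (shown 3'->5' in the text)
OLIGO_T = "TTTTT"                   # T5

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def binding_site(template: str, oligo: str, start: int = 0) -> int:
    """Return the 0-based template index where oligo pairs perfectly, or -1."""
    return template.find(reverse_complement(oligo), start)


def abuts_donor(template: str, oligo: str, donor: str) -> bool:
    """True if oligo pairs immediately 3' of the donor's site on the template,
    so that the oligo's 3' end lies next to the donor's 5' end."""
    donor_start = binding_site(template, donor)
    if donor_start < 0:
        return False
    donor_end = donor_start + len(donor)
    return binding_site(template, oligo, donor_end) == donor_end


print(binding_site(TEMPLATE, DONOR))          # donor site on the template
print(abuts_donor(TEMPLATE, OLIGO_T, DONOR))  # is T5 adjacent to the donor?
```

Because the strands pair antiparallel, adjacency on the template places the 3' hydroxyl of T5 next to the 5' phosphate of the donor, which leaves the 5' end of T5 free to carry a tag.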

Cozzarelli et al (1967) have shown that
polynucleotides attached to solid supports can be
ligated to an acceptor in the presence of a
complementary template.

Example 17

Hybridisation of tagged oligonucleotides to 
oligonucleotides tethered to a solid support. 

Example 16b shows that tagged oligonucleotides
can take part in ligation reactions, implying that
they can also take part in duplex formation in
solution, as ligation depends on this process. The
following experiment shows that they can also form
duplexes with oligonucleotides tethered to a solid
support. Oligo-dT was synthesised on the surface of a sheet
of aminated polypropylene according to the
manufacturer's instructions. It is known that this
process yields around 10 pmols of oligonucleotide per
mm². A solution in 3.5M tetramethylammonium chloride
of the probe (65 pmol per microlitre), labelled at the 5' end
with 32P and tagged at the 3' end with R10, was laid on the
surface of the derivatised polypropylene and left
overnight at 4°C. After washing in the hybridisation
solvent, it was found that around one third of the
probe had hybridised to the tethered oligo-dT. This
is close to the theoretical limit of hybridisation,
showing that tagged oligonucleotides can take part in
hybridisation reactions with high efficiency.

Example 18
Photolysis of tags. 

The potential to remove tags by photolysis
would greatly enhance their usefulness: it would allow
for direct analysis by laser desorption in the mass
spectrometer, and it would provide a simple method of
removing the tags to allow other biochemical or
chemical processes.

18a. Bulk photolysis.

The nitrobenzyl group is known to be labile to
irradiation at 305 nm. Solutions of R10A10 and R10T5 in
water were irradiated at 2 cm from a transilluminator
for 20 min under conditions known to cause no
detectable damage to nucleic acids. Analysis by HPLC
showed the expected products of photocleavage, with no
detectable residue of the original compound.

18b. Laser induced photolysis in the mass spectrometer.

Samples of R10T5 and T5R10 were deposited on
the metal target of a time of flight mass spectrometer
(Finnigan Lasermat) without added matrix. The spectrum
showed a single saturated peak at around mass 243 in the
positive mode that was absent in other samples.






Identification by mass spectrometry of 
different tags attached to different analytes. 



A sequence of five thymidine residues with a
dimethoxytrityl group attached as a tag to the 3' end
was synthesised by conventional solid phase methods,
but using "reverse synthons". In the mass 
spectrometer, this compound gave a large and distinct 
peak at mass 304, in the positive ion mode. By 
contrast a sequence of ten adenosine residues carrying 
the tag designated R10 above gave a large and distinct 
peak at mass 243 in the positive ion mode. In both 
cases, laser desorption was carried out in the absence 
of matrix. In both cases the peaks are absent from the 
oligonucleotides which have no tag. These examples 
show that it is easily possible to identify an analyte 
sequence from the presence of a peak in the mass 
spectrometric trace that derives from a tag 
incorporated during the synthesis of the analyte, and 
that characteristic tags are readily identified by 
their different mass. 
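
As an illustration of how such peaks might be read automatically, the following sketch looks up observed masses against the two tag peaks reported above (304 for the dimethoxytrityl tag and 243 for R10). The matching tolerance and the function name are assumptions made for the example and are not part of the described method.

```python
# Peak masses reported in the text (positive ion mode, no matrix).
TAG_PEAKS = {
    304: "dimethoxytrityl tag (on the T5 sequence)",
    243: "R10 tag (on the A10 sequence)",
}


def identify_tags(peaks, tolerance=0.5):
    """Return labels of known tags whose peak lies within tolerance of an
    observed mass. The tolerance value is an assumption for illustration."""
    found = []
    for observed in peaks:
        for mass, label in TAG_PEAKS.items():
            if abs(observed - mass) <= tolerance:
                found.append(label)
    return found


# Hypothetical peak list: one tag peak plus an unrelated signal.
print(identify_tags([243.2, 150.0]))
```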
FIGURE LEGENDS

Fig. 1. General scheme for synthesis of molecules with
specific tags.

Synthesis starts from a linker (L) with at
least one site for the addition of groups for
synthesising the analyte and one for synthesising the
tag. (The linker may also be attached reversibly to a
solid support during synthesis, and may have sites for
generating groups such as charged groups which may help
in analysis.) Pa and Pr are temporary protecting
groups for the analyte precursors and the reporters
respectively; they will be removable by different
treatments. For example, Pa may be an acid or base
labile group such as trityl, F-MOC or t-BOC, and Pr a
group removable by treatment with fluoride such as a
silyl residue. Groups U-Z may also have protecting
groups which must be stable to the reagents used to
remove Pa and Pr. Coupling chemistries will be
different for different analyte types; standard methods
are available for oligonucleotide and peptide
synthesis.

Three different types of tags are described 
in Fig. 2. For the first scheme, each extension of the 
tag is carried out with a reporter which is specific 
for both position and type of residue added to the 
analyte. Capping is not important for this scheme. 

In the second and third schemes, position is 
defined by the total mass of reporter reached at the 
stage in synthesis when the residue is added to the 
analyte. In this case it is important to terminate 
part of the extension of the tag by capping a portion 
of the molecules. The second and third schemes differ 
from each other in the way the reporters are added. In 
the second they are in the extension agents; in the 
third they are in the caps. 
Fig. 2. Three types of molecule-specific tags.

A. Illustrates tags made of reporters (E) that
specify both position (subscript) and identity
(superscript) of the groups in the analyte (U-Z).
Such a set could comprise a series of aliphatic chains
of increasing formula weight to specify position: for
example, methylene for position 1, ethylene for 2,
propylene for 3 etc. These could be differentiated into
group-specific types by different isotopic compositions
of carbon and hydrogen: for example, there are six
different isotopic compositions of CH2, as shown in
Table 1 above. Four of these differ by one mass unit
and should be readily distinguished by mass
spectrometry. Other ways of differential labelling can
be envisaged. For example, either position or group
could be marked by reporter groups with different
charges. Such groups can be separated and recognised by
a number of methods including mass spectrometry.
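
The count of isotopic compositions can be reproduced with a few lines of code. The sketch below assumes that only the isotopes 12C/13C and 1H/2H are used and works with nominal (integer) masses; it is an illustration of the arithmetic, not a reproduction of Table 1.

```python
from itertools import combinations_with_replacement

CARBON_ISOTOPES = (12, 13)   # nominal masses of 12C and 13C
HYDROGEN_ISOTOPES = (1, 2)   # nominal masses of 1H and 2H


def ch2_compositions():
    """List every isotopic composition of CH2 with its nominal mass.

    Each entry is (carbon mass, (hydrogen masses), total mass). The two
    hydrogens are unordered, so 1H2H and 2H1H count as one composition.
    """
    compositions = []
    for carbon in CARBON_ISOTOPES:
        for hydrogens in combinations_with_replacement(HYDROGEN_ISOTOPES, 2):
            compositions.append((carbon, hydrogens, carbon + sum(hydrogens)))
    return compositions


compositions = ch2_compositions()
distinct_masses = sorted({mass for _, _, mass in compositions})
print(len(compositions), distinct_masses)
```

The six compositions collapse onto four nominal masses because, for example, 13C with two 1H has the same nominal mass as 12C with one 1H and one 2H.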

B. Shows tags made by partial synthesis, such
that any structure of the analyte is attached to a
series of tags; the first member of the series has a
reporter group specific for the first group of the
analyte; the second has the first reporter plus a
second reporter specific for the second group of the
analyte and so on. Such a series can readily be made
by using two kinds of precursor for extending the tag:
one which is protected by a reversible blocking group
and one which prevents further extension. For example,
a mixture of RX and P-(CH2)nX, where R is a non-reactive
aliphatic group such as methyl or ethyl, P is
a reversible protecting group and X is an activated
residue that can react with the group protected by P.
Those molecules which have been "capped" by the
non-reactive aliphatic group will not take part in the next
round of deprotection and extension.
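
How such a series of partial products encodes a sequence can be sketched as follows. The reporter masses assigned to each residue type here are hypothetical, and the masses of the linker and cap are ignored; the point is only that the difference between successive members of the series identifies the reporter, and hence the residue, added at each position.

```python
# Hypothetical reporter masses, one per residue type (illustration only).
REPORTER_MASS = {"A": 14, "C": 15, "G": 16, "T": 17}
RESIDUE_FOR_MASS = {mass: base for base, mass in REPORTER_MASS.items()}


def ladder_masses(sequence):
    """Return the tag mass of each partial product for an analyte sequence.

    The k-th partial product carries the reporters for residues 1..k.
    """
    masses, total = [], 0
    for residue in sequence:
        total += REPORTER_MASS[residue]
        masses.append(total)
    return masses


def read_sequence(masses):
    """Recover the analyte sequence from an unordered set of ladder masses."""
    residues, previous = [], 0
    for mass in sorted(masses):
        residues.append(RESIDUE_FOR_MASS[mass - previous])
        previous = mass
    return "".join(residues)


ladder = ladder_masses("ATGCCT")
print(ladder)
print(read_sequence(reversed(ladder)))
```

The decoding works on an unordered peak list because every extension adds mass, so sorting the peaks restores the order of synthesis.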

In B the group-specific information is
contained in the residues used to extend the synthesis.
As in A, the information could be provided using mass
isotopes. For example, every addition of a residue
labelled with the isotopes of C and H to P-(CH2)nX
adds further sites that can provide different mass to
the reporter. The masses of the (CH2)n range from 14n
to 17n and there are 4 + 3(n-1) different masses in the
range. Thus for the ethylene group there are seven
distinct masses in the range 28 to 34, and for the
propylene group, ten in the range 42 to 51.
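
The formula 4 + 3(n-1) can be verified by brute force. This sketch assumes each CH2 unit can take any nominal mass from 14 to 17 independently and simply collects every total that can be reached for a chain of n units.

```python
CH2_MASSES = (14, 15, 16, 17)  # nominal masses available to one CH2 unit


def distinct_chain_masses(n):
    """Return the sorted distinct nominal masses of an isotopically
    labelled (CH2)n chain, found by exhaustive combination."""
    masses = {0}
    for _ in range(n):
        masses = {total + unit for total in masses for unit in CH2_MASSES}
    return sorted(masses)


for n in (1, 2, 3):
    masses = distinct_chain_masses(n)
    # Compare the brute-force count with the closed form 4 + 3(n - 1).
    print(n, len(masses), 4 + 3 * (n - 1), masses[0], masses[-1])
```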
C. Shows how the group-specific information can
be added in a different way; in this case it is
contained in the chain terminator, the "cap" in example
B. Again, different masses could be provided by
labelling an aliphatic residue. Positional information
is provided by the length of the extension at which the
terminator was added. Suppose that E is (CH2)2-O, and
the terminators are isotopically labelled methyl groups
with formula weights from 15 to 19. Each extension will
add 44 mass units to the reporter. The mass range for
the shortest reporter would be from 44 + 15 = 59 to
44 + 19 = 63. The range for the second position would
be from 88 + 15 = 103 to 88 + 19 = 107, and so on to
the sixth, where the range is from 264 + 15 = 279 to
264 + 19 = 283. There is no overlap between these ranges, and
it can be seen that the number of reporters and the
range could be extended by using terminators and
extensions with more atoms.
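
The mass windows for this scheme follow directly from the two numbers given: 44 mass units per extension and terminator masses of 15 to 19. The sketch below computes the window for each position, checks that successive windows do not overlap, and decodes a reporter mass back into position and terminator mass. The function names are illustrative only.

```python
EXTENSION_MASS = 44            # one (CH2)2-O unit
CAP_MASSES = range(15, 20)     # labelled methyl terminators, 15 to 19


def reporter_range(position):
    """Return (lowest, highest) reporter mass for a cap added at position."""
    base = position * EXTENSION_MASS
    return base + CAP_MASSES[0], base + CAP_MASSES[-1]


def windows_overlap(max_position):
    """True if any two consecutive positional mass windows overlap."""
    ranges = [reporter_range(p) for p in range(1, max_position + 1)]
    return any(lo <= prev_hi for (_, prev_hi), (lo, _) in zip(ranges, ranges[1:]))


def decode(mass):
    """Split a reporter mass into (position, cap mass); None if invalid."""
    position, cap = divmod(mass, EXTENSION_MASS)
    if position >= 1 and cap in CAP_MASSES:
        return position, cap
    return None


for position in range(1, 7):
    print(position, reporter_range(position))
print(windows_overlap(6), decode(107))
```

Decoding by integer division works because every terminator mass is smaller than one extension unit, which is the same condition that keeps the windows from overlapping.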






LITERATURE CITED 



1. Brenner, S. and Lerner, R.A. (1992).
Encoded combinatorial chemistry. Proc. Natl. Acad.
Sci. USA 89: 5381-5383.

2. Drmanac, R., Labat, I., Brukner, I., and
Crkvenjakov, R. (1989). Sequencing of megabase plus
DNA by hybridization: Theory of the method. Genomics
4: 114-128.

3. Pillai, V.N.R. (1980). Photoremovable
protecting groups in organic chemistry. Synthesis 39:
1-26.

4. Hoheisel, J.D., Maier, E., Mott, R.,
McCarthy, L., Grigoriev, A.V., Schalkwyk, L.C.,
Nitzetic, D., Francis, F. and Lehrach, H. (1993). High
resolution cosmid and P1 maps spanning the 14 Mbp
genome of the fission yeast Schizosaccharomyces pombe.
Cell 73: 109-120.

5. Khrapko, K.R., Lysov, Yu.P., Khorlyn, A.A.,
Shick, V.V., Florentiev, V.L., and Mirzabekov
(1989). An oligonucleotide hybridization approach to
DNA sequencing. FEBS Lett. 256: 118-122.

6. Patchornik, A., Amit, B. and Woodward, R.B.
(1970). Photosensitive protecting groups. J. Amer.
Chem. Soc. 92:21: 6333-6335.

7. Ross, M.T., Hoheisel, J.D., Monaco, A.P.,
Larin, Z., Zehetner, G., and Lehrach, H. (1992). High
density gridded YAC filters; their potential as genome
mapping tools. In "Techniques for the analysis of
complex genomes." Anand, R. ed. (Academic Press)
137-154.

8. Southern, E.M. (1988). Analyzing
Polynucleotide Sequences. International Patent
Application PCT GB 89/00460.

9. Southern, E.M., Maskos, U. and Elder, J.K.
(1992). Analysis of Nucleic Acid Sequences by
Hybridization to Arrays of Oligonucleotides: Evaluation
using Experimental Models. Genomics 12: 1008-1017.

10. de Vries, M.S., Elloway, D.J., Wendl, R.H.,
and Hunziker, H.E. (1992). Photoionisation mass
spectrometer with a microscope laser desorption source.
Rev. Sci. Instrum. 63(6): 3321-3325.

11. Zubkov, A.M., and Mikhailov, V.G. (1979).
Repetitions of s-tuples in a sequence of independent
trials. Theory Prob. Appl. 24: 269-282.

12. Cozzarelli, N.R., Melechen, N.E., Jovin, T.M.
and Kornberg, A. (1967). BBRC 28: 578-586.






SEQUENCE LISTING 



(1) GENERAL INFORMATION:

(i) APPLICANT: 

(A) NAME: ISIS INNOVATION LIMITED 

(B) STREET: 2 South Parks Road 

(C) CITY: Oxford 

(E) COUNTRY: United Kingdom 

(F) POSTAL CODE (ZIP): OX1 3UB

(A) NAME: SOUTHERN, EDWIN 

(B) STREET: 30 Staverton Road 

(C) CITY: Oxford 

(E) COUNTRY: United Kingdom 

(F) POSTAL CODE (ZIP): OX2 6XJ

(A) NAME: CUMMINS, WILLIAM JONATHAN 

(B) STREET: 5 Thorntree Drive 

(C) CITY: Tring 

(D) STATE: Herts, 

(E) COUNTRY: United Kingdom 

(F) POSTAL CODE (ZIP): HP23 4JE 

(ii) TITLE OF INVENTION: TAG REAGENT AND ASSAY METHOD 
(iii) NUMBER OF SEQUENCES: 2 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(vi) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: GB 9315847.5 

(B) FILING DATE: 30-JUL-1993 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
ATCAAGTCAG AAAAATATAT A

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : Single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
CTGACTTGAT 
CLAIMS

1. A reagent comprising

a) an analyte moiety comprising at least two
analyte residues, and linked to

b) a tag moiety comprising one or more reporter
groups adapted for detection by mass spectrometry,

wherein a reporter group designates an
analyte residue, and the reporter group at each
position of the tag moiety is chosen to designate an
analyte residue at a defined position of the analyte
moiety.

2. A reagent as claimed in claim 1, wherein 
there is provided a linker group to which is attached 
the analyte moiety and the tag moiety. 

3. A reagent as claimed in claim 1 or claim 2, 
wherein the analyte moiety is a chain of n analyte 
residues, and the tag moiety is a chain of up to n 
reporter groups, the reporter group at each position of 
the tag chain being chosen to designate the analyte 
residue at a corresponding position of the analyte
chain.

4. A reagent as claimed in any one of claims 1 
to 3, wherein the analyte moiety is linked to the tag 
moiety by a photocleavable link. 

5. A reagent as claimed in any one of claims 1
to 4, having a formula A - L - R where A is a chain of
n analyte residues constituting the analyte moiety, L
is the linker, R is a chain of up to n reporter groups
constituting the tag moiety, and n is 2 - 20, wherein
the tag moiety contains information defining the
location of analyte residues in the analyte moiety.

6. A reagent as claimed in any one of claims 2
to 5, wherein the linker comprises an aromatic group
carrying a hydroxy, amino or sulphydryl group for
analyte moiety synthesis, and a reactive group for tag
moiety synthesis.

7. A reagent as claimed in claim 6, wherein
the aromatic group carrying a hydroxy, amino or
sulphydryl group for analyte moiety synthesis, also
carries an o-nitro group for photocleavage.

8. A reagent as claimed in any one of claims 1
to 7, wherein there is present a charged group for
analysis by mass spectrometry.

9. A reagent as claimed in any one of claims 1
to 8, wherein the analyte moiety is a peptide chain.

10. A reagent as claimed in any one of claims 1
to 8, wherein the analyte moiety is an oligonucleotide
chain.

11. A library of the reagents claimed in any one
of claims 1 to 10, wherein the library consists of a
plurality of reagents each comprising a different
analyte moiety.

12. A library as claimed in claim 11, wherein the
library consists of 4^n reagents each comprising a
different analyte moiety which is a different
oligonucleotide chain of n nucleotides.

13. A library as claimed in claim 12, wherein the
reagents are mixed together in solution.

14. An assay method which comprises the steps of:
providing a target substance; incubating the target
substance with the library of reagents claimed in any
one of claims 11 to 13 under conditions to cause at
least one reagent to bind to the target substance;
removing non-bound reagents; recovering the tag
moieties of the or each bound reagent; and analysing
the recovered tag moieties as an indication of the
nature of the analyte moieties bound to the target
substance.

15. An assay method as claimed in claim 14,
wherein the target substance is an organism or tissue
or group of cells.

16. A method of sequencing a target nucleic acid,
which method comprises the steps of:

a) providing an oligonucleotide immobilised on a
support,

b) hybridising the target nucleic acid with the
immobilised oligonucleotide,

c) incubating the hybrid from b) with the
library claimed in claim 13, so that an oligonucleotide
chain of a first reagent of the library becomes
hybridised to the target nucleic acid adjacent the
immobilised oligonucleotide,

d) ligating the adjacent oligonucleotides, thus
forming a ligated first reagent,

e) removing other non-ligated reagents, and

f) recovering and analysing the tag moiety of
the ligated first reagent as an indication of the
sequence of a first part of the target nucleic acid.

17. A method as claimed in claim 16, comprising
the additional steps of

ci) incubating the hybrid from f) with the
library claimed in claim 13, so that an oligonucleotide
chain of a second reagent of the library becomes
hybridised to the target nucleic acid adjacent the
oligonucleotide chain of the first reagent,

di) ligating the adjacent oligonucleotides, thus
forming a ligated second reagent,

ei) removing other non-ligated reagents, and

fi) recovering and analysing the tag moiety of
the ligated second reagent as an indication of the
sequence of a second part of the target nucleic acid.

18. A method as claimed in claim 16 or claim 17,
wherein: in step a) the oligonucleotide is immobilised
on the ends of a series of pins as the support; in
step b) an individual clone of target DNA is hybridised
to the oligonucleotide immobilised on each individual
pin; in steps c) and d) there are formed a series of
ligated reagents, with different pins carrying
different ligated reagents; and in step f) the tag
moiety of each ligated reagent is recovered and
analysed as an indication of the sequence of a part of
the target DNA.

19. A method as claimed in claim 16 or claim 17,
wherein: in step b) each individual clone of target
DNA is hybridised to the oligonucleotide immobilised at
an individual spaced location of the support; in steps
c) and d) there are provided a series of ligated
reagents with different spaced locations of the support
carrying different ligated reagents; and in step f)
the tag moiety of each ligated reagent is recovered and
analysed as an indication of the sequence of a part of
the target DNA.

20. A method as claimed in claim 16 or claim 17,
wherein the method comprises the steps of:

a) providing an array of oligonucleotides
immobilised at spaced locations on a support, an
oligonucleotide at one location being different from
oligonucleotides at other locations,

b) incubating the target nucleic acid with the
array of immobilised oligonucleotides, so as to form
hybrids at one or more spaced locations on the support,

c) incubating the hybrids from b) with the
library claimed in claim 13, so that an oligonucleotide
chain of a reagent of the library becomes hybridised to
the target nucleic acid adjacent each immobilised
oligonucleotide,

d) ligating adjacent oligonucleotides, thus
forming ligated reagents at the one or more spaced
locations on the support,

e) removing other non-ligated reagents, and

f) recovering and analysing the tag moiety of
each ligated reagent as an indication of the sequence
of a part of the target nucleic acid.

21. A method as claimed in claim 20, wherein the
sequence is known of the oligonucleotide immobilised by
a covalent bond at each spaced location on the support.

22. A method of analysing a target DNA, which
method comprises the steps of:

i) providing the target DNA immobilised on a
support,

ii) incubating the immobilised target DNA from i)
with a plurality of the reagents claimed in claim 10, so
that the oligonucleotide chains of different reagents
become hybridised to the target DNA on the support,

iii) removing non-hybridised reagents, and

iv) recovering and analysing the tag moiety of
each reagent as an indication of the sequence of a part
of the target DNA.

23. A method as claimed in claim 22, comprising
the additional steps of:

iia) incubating the hybrid from iv) with the
library of reagents claimed in claim 13, so that
oligonucleotide chains of different reagents become
hybridised to the target DNA,

iiia) ligating adjacent oligonucleotides hybridised
to the target DNA and removing non-ligated reagents,
and

iva) recovering and analysing the tag moiety of
each ligated reagent as an indication of the sequence
of part of the target DNA.

24. A method as claimed in claim 22 or claim 23,
wherein individual clones of the target nucleic acid
are immobilised at spaced locations on the support,
whereby in step ii) the oligonucleotide chains of
different reagents become hybridised to the target
nucleic acid at different spaced locations on the
support.

25. A method as claimed in any one of claims 14
to 24, wherein each tag moiety is recovered by
photocleavage from its associated reagent.

26. A method as claimed in any one of claims 14
to 25, wherein the tag moiety is analysed by mass
spectrometry.

27. Assay equipment comprising: a support having
two or more spaced locations thereon; individual clones
of a target nucleic acid immobilised at the spaced
locations on the support; and different reagents
according to claim 10 hybridised to the individual
clones of the target nucleic acid at the spaced
locations on the support.
Fig. 1. General scheme for synthesis of molecules with specific tags. [Scheme steps: deblock Pa, couple group; deblock Pr, couple reporter (and cap); repeat.]
Fig. 2. Three types of molecule-specific tags. [Panel A: position- and group-specific reporters (cleavage and fragmentation, analysis). Panel B: group-specific reporters, positions specified by partial products (cleavage, analysis). Panel C: group-specific end-labels, positions specified by partial products (cleavage).]
Fig. 3a. Synthesis of coded oligonucleotides. [Scheme steps: bivalent linker with oligonucleotide precursors and coding precursors; couple first base; couple first code; repeat cycle with second base and code; etc. to n base/code pairs; break linker.]

Fig. 3b. Reading the code. [Sequence can be read from composition of partial products.]
Fig. 4. [Scheme steps: read code and generate OH group; hybridise oligonucleotide pool; ligate next oligonucleotide.]
Fig. 5. [Scheme steps: sequence to be read and tethered oligonucleotide; hybridise template to tethered oligonucleotides; hybridise oligonucleotide pool; ligate oligonucleotide; read code.]